Abstract. We prove a variant of the multidimensional polynomial Szemerédi theorem of
Bergelson and Leibman where one replaces polynomial sequences with other sparse sequences
defined by functions that belong to some Hardy field and satisfy certain growth conditions.
We do this by studying the limiting behavior of the corresponding multiple ergodic averages
and obtaining a simple limit formula. A consequence of this formula in topological dynamics
shows denseness of certain orbits when the iterates are restricted to suitably chosen sparse
subsequences. Another consequence is that every syndetic set of integers contains certain
non-shift invariant patterns, and every finite coloring of $\mathbb{N}$, with each color class a syndetic set,
contains certain polychromatic patterns, results very particular to our non-polynomial setup.



1. Introduction 

In [19], Furstenberg gave an ergodic theoretic proof of Szemerédi's theorem on arithmetic
progressions, and using similar methods, Furstenberg and Katznelson [21] proved a
multidimensional extension of Szemerédi's theorem. Later on, Bergelson and Leibman [7] gave a
polynomial extension of this result, a special case of which states that given any collection of
polynomials $p_1,\ldots,p_\ell\colon\mathbb{N}\to\mathbb{Z}$ with zero constant term, and vectors $v_1,\ldots,v_\ell\in\mathbb{Z}^d$, every
subset of $\mathbb{Z}^d$ of positive upper density contains configurations of the form

(1) $$\{v,\ v+p_1(n)v_1,\ \ldots,\ v+p_\ell(n)v_\ell\}$$

for some $v\in\mathbb{Z}^d$ and $n\in\mathbb{N}$. In the course of proving this result they introduced and studied
the limiting behavior in $L^2(\mu)$ of the following multiple ergodic averages

(2) $$\frac{1}{N}\sum_{n=1}^{N} f_1(T_1^{p_1(n)}x)\cdots f_\ell(T_\ell^{p_\ell(n)}x),$$

where $T_1,\ldots,T_\ell\colon X\to X$ are invertible commuting measure preserving transformations acting
on some probability space $(X,\mathcal{X},\mu)$ and $f_1,\ldots,f_\ell\in L^\infty(\mu)$. Their goal was to prove a multiple
recurrence property, namely, that for every $A\in\mathcal{X}$ with $\mu(A)>0$ one has

(3) $$\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mu\big(A\cap T_1^{-p_1(n)}A\cap\cdots\cap T_\ell^{-p_\ell(n)}A\big)>0.$$

From this, the combinatorial result follows via the correspondence principle of Furstenberg [19,
20]. Bergelson and Leibman managed to prove this multiple recurrence property without getting
very precise information about the limit of the averages (2). An important role in their proof was
played by an ergodic structure theorem (already present in [21]) and the coloristic counterpart of
their density result, now known as the polynomial van der Waerden theorem, which they proved using
more elementary methods.^1 The reader can find several other examples where ergodic methods
were used to prove combinatorial results in the surveys [3, 4, 28, 29].

In the present article, we establish a variant of the polynomial Szemerédi theorem where
one replaces the polynomials $p_1,\ldots,p_\ell$ with a collection of sparse sequences of integers defined
using functions that belong to some Hardy field and satisfy certain growth conditions. For
instance, we show that one can substitute the configurations (1) with configurations of the
form

$$\{v,\ v+[n^{c_1}]v_1,\ \ldots,\ v+[n^{c_\ell}]v_\ell\}$$

for every choice of distinct positive non-integers $c_1,\ldots,c_\ell$. Despite the similarity of this result
with the polynomial Szemerédi theorem, its proof is very different. This is mainly because we
are unable to prove the corresponding coloristic result in a simple way (the only proof we know
uses the density result). To circumvent this problem, we deviate from the classical methods used
in [7, 21], and aim at proving the needed multiple recurrence property by obtaining a complete
understanding of the limiting behavior of the corresponding multiple ergodic averages. In our
particular setup, we establish the following explicit limit formula

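(4) $$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} f_1(T_1^{[n^{c_1}]}x)\cdots f_\ell(T_\ell^{[n^{c_\ell}]}x)=\tilde f_1\cdots\tilde f_\ell$$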

where $c_1,\ldots,c_\ell$ are distinct positive non-integers, the convergence takes place in $L^2(\mu)$, and
$\tilde f_i$ is the orthogonal projection of the function $f_i$ on the subspace of functions that are left
invariant by the transformation $T_i$. The proof of identity (4) relies on ergodic decomposition
results, seminorm estimates, and equidistribution results on nilmanifolds.

Because of the explicit evaluation of the limit in (4), it is a simple matter to prove a multiple
recurrence property analogous to (3), with an explicit lower bound, namely,

(5) $$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mu\big(A\cap T_1^{-[n^{c_1}]}A\cap\cdots\cap T_\ell^{-[n^{c_\ell}]}A\big)\ge(\mu(A))^{\ell+1}$$

where, as usual, $c_1,\ldots,c_\ell$ are distinct positive non-integers.

We remark that identity (4) and estimate (5) fail if one of the numbers $c_1,\ldots,c_\ell$ is an
integer greater than 1. This is a known feature of polynomial sequences caused by their lack
of equidistribution in congruence classes. In this respect, fractional powers, as well as other
sequences that we consider next, are better suited for the problems we are interested in.
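The contrast between polynomial and fractional powers in congruence classes can be seen numerically. The following is only an illustrative sketch, not part of the argument: the exponent $3/2$, the modulus 3, the cutoff `N_TERMS`, and all function names are assumptions chosen for the example. It compares how often $n^2$ and $[n^{3/2}]$ fall in each residue class modulo 3.

```python
from math import isqrt


def residue_frequencies(seq, modulus, n_terms):
    """Fraction of n in 1..n_terms for which seq(n) lies in each residue class."""
    counts = [0] * modulus
    for n in range(1, n_terms + 1):
        counts[seq(n) % modulus] += 1
    return [c / n_terms for c in counts]


def floor_power_three_halves(n):
    # [n^(3/2)] computed exactly as the integer square root of n^3
    return isqrt(n ** 3)


def square(n):
    return n * n


N_TERMS = 100_000  # arbitrary cutoff chosen for the illustration
freq_hardy = residue_frequencies(floor_power_three_halves, 3, N_TERMS)
freq_poly = residue_frequencies(square, 3, N_TERMS)

print("[n^(3/2)] mod 3:", freq_hardy)
print("n^2 mod 3:      ", freq_poly)
```

Since a square is never congruent to 2 modulo 3, the polynomial sequence misses a residue class entirely, whereas the frequencies for $[n^{3/2}]$ should all be close to $1/3$.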

The method of proof of (4) allows us to work in a much more general setup. We prove that
the place of the sequences $[n^{c_1}],\ldots,[n^{c_\ell}]$ can be taken by any collection of sequences $[a_1(n)],\ldots,[a_\ell(n)]$,
where the functions $a_1(t),\ldots,a_\ell(t)$ belong to some Hardy field, have different growth rates, and,
roughly speaking, grow like a fractional power of $t$ (for the exact statements see Theorems 2.3
and 2.4). For instance, we can use the following collection of sequences

$$\{[n^c(\log n)^{d_1}],\ \ldots,\ [n^c(\log n)^{d_\ell}]\}$$

^1 When $T_1=\cdots=T_\ell$, using deep results from [25, 26, 31, 36], property (3) was proved in [8] without appealing
to the polynomial van der Waerden theorem. No such proof for general commuting transformations is known.
where $c$ is a positive non-integer and $d_1,\ldots,d_\ell\in\mathbb{R}$ are distinct, or more exotic collections like
{[^, [nVn^ + l], [n3/2eVI^^i^], [nVlogn], £ e^Ut\y 

Another interesting consequence of the limit formula (4) is in topological dynamics. It
enables us to show, for instance, that if $T, S$ are commuting minimal transformations acting on
a compact metric space $(X,d)$, and $a,b$ are distinct positive non-integers, then for a residual
set of $x\in X$ one has

$$\overline{\{(T^{[n^a]}x,\ S^{[n^b]}x)\colon n\in\mathbb{N}\}}=X\times X.$$

Periodic systems show that this fails if either $a$ or $b$ is an integer greater than 1.

The limit formula (4) also has some rather unusual consequences in combinatorics. It implies
that if $E\subset\mathbb{N}$ is syndetic (i.e. finitely many translates of $E$ cover $\mathbb{N}$), then it contains certain
non-shift invariant patterns. For instance, we prove that for $a, b$ as before, the system

$$2y-x=[n^a]$$
$$3z-x=[n^b]$$

has a solution with $x,y,z\in E$ and $n\in\mathbb{N}$. It also implies that for every finite coloring of $\mathbb{N}$,
where each color class is a syndetic set, the system

$$y-x=[n^a]$$
$$z-x=[n^b]$$

has a solution with $x, y, z$ having arbitrary colors. Again, these results are very particular to
our non-polynomial setup and fail if either $a$ or $b$ is an integer greater than 1.
In the next section we give a precise formulation of our main results. 

2. Main results 

2.1. Our setup. In order to properly state our results we have to first introduce some notation. 

A system $(X,\mathcal{X},\mu,T_1,\ldots,T_\ell)$ is a Lebesgue probability space $(X,\mathcal{X},\mu)$ together with a
collection of commuting invertible measure preserving transformations $T_1,\ldots,T_\ell\colon X\to X$.
By $\mathbb{E}(f|\mathcal{I}_{T_i})$ we denote the conditional expectation on the $\sigma$-algebra of $T_i$-invariant sets.
Equivalently, this is the orthogonal projection on the closed subspace of $T_i$-invariant functions.

Throughout the article we use the symbol $\mathcal{H}$ to denote a translation invariant Hardy field (all
notions defined in Section 3.1). All iterates of the transformations involved in our statements
are defined using functions that belong to the same Hardy field. This particular setup enables
us to work within a rich class of functions and offers several aesthetic and technical advantages.

In most instances, we restrict our attention to the following "good" class of functions: 

Definition 2.1. We denote by $\mathcal{G}$ the collection of all functions $a\colon[c,\infty)\to\mathbb{R}$ that satisfy the
growth conditions $|a(t)|/(t^d\log t)\to\infty$ and $|a(t)|/t^{d+1}\to 0$ as $t\to\infty$ for some integer $d\ge 0$.

The presence of the logarithm in the first condition is purely for technical reasons: it ensures
that successive differences of functions in $\mathcal{G}\cap\mathcal{H}$ either converge to 0 or else are functions with
substantial growth (this follows from Lemma 3.2). The key features of functions in $\mathcal{G}$ are: (i)
they do not grow very fast, and (ii) they "stay away" from all polynomials in a rather strong
sense. Staying away from polynomials is a property that we desire since the conclusions of our
main results fail for some polynomials with integer coefficients.
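To make the growth conditions of Definition 2.1 concrete, here is a small sketch that decides membership in the "good" class for the model family $a(t)=t^c(\log t)^b$. The function name `good_class_degree` and the restriction to this family are assumptions made for the illustration; the case analysis just compares $t^c(\log t)^b$ with $t^d\log t$ and $t^{d+1}$.

```python
from math import floor


def good_class_degree(c, b=0.0):
    """For a(t) = t^c (log t)^b, return an integer d >= 0 such that
    t^d log t grows slower than a(t), which grows slower than t^(d+1);
    return None if no such d exists."""
    if c < 0:
        return None  # a(t) tends to 0, so it cannot dominate log t
    if not float(c).is_integer():
        # Non-integer exponent: d = floor(c) works for every b, because
        # powers of t dominate powers of log t.
        return floor(c)
    c_int = int(c)
    if b > 1:
        # t^c log t is dominated by t^c (log t)^b exactly when b > 1
        return c_int
    if b < 0 and c_int >= 1:
        # t^(c-1) log t is dominated by a(t), and a(t) by t^c, since b < 0
        return c_int - 1
    return None  # e.g. genuine monomials t^c with integer c are excluded


for c, b in [(1.5, 0), (2, 0), (2, -1), (1, 2), (0.5, 0), (1, 1)]:
    print((c, b), "->", good_class_degree(c, b))
```

In particular, every fractional power $t^c$ with $c$ a positive non-integer is in the class, while the monomials $t$, $t^2$, and so on are not.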
2.2. Results in ergodic theory. For the sake of brevity we define:

Definition 2.2. The functions $a_1,\ldots,a_\ell\colon[c,\infty)\to\mathbb{R}$ are said to have different growth rates if
their pairwise quotients converge to $\pm\infty$ or to 0.

2.2.1. The limit formula. The main result of this article is the following limit formula (a special
case of this was stated as Problem 6 in [16] and as Problem 29 in [17]):

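Theorem 2.3. Let $\mathcal{H}$ be a Hardy field and $a_1,\ldots,a_\ell\in\mathcal{G}\cap\mathcal{H}$ be functions with different growth
rates. Then for every system $(X,\mathcal{X},\mu,T_1,\ldots,T_\ell)$ and functions $f_1,\ldots,f_\ell\in L^\infty(\mu)$ we have

(6) $$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T_1^{[a_1(n)]}f_1\cdots T_\ell^{[a_\ell(n)]}f_\ell=\tilde f_1\cdots\tilde f_\ell$$

where $\tilde f_i:=\mathbb{E}(f_i|\mathcal{I}_{T_i})=\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T_i^nf_i$ and the convergence takes place in $L^2(\mu)$.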

The case $\ell=1$ follows from the equidistribution results in [10] and the case where all the
iterates have sub-linear growth follows from [16] (this case is simple and no commutativity of
the transformations is needed). When all the transformations are equal a slightly weaker result
is proved in [16].^2 Easy examples of rational rotations on the circle show that for $\ell\ge 2$ the limit
formula (6) fails when the iterates are given by polynomial sequences, even if these polynomials
have distinct degrees. In fact, it fails if some non-trivial linear combination of the functions
$a_1,\ldots,a_\ell$ is a polynomial different than $\pm t+c$. When the assumption that the transformations
commute is removed, and two or more iterates have super-linear growth, examples from [18]
show that the limit in (6) does not in general exist. Lastly, we remark that in (6) the limit
$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}$ cannot be replaced by the uniform limit $\lim_{N-M\to\infty}\frac{1}{N-M}\sum_{n=M}^{N-1}$. This is
because for $a\in\mathcal{H}\cap\mathcal{G}$ one can show that the sequence $([a(n)])$ takes odd (respectively even)
values in arbitrarily long intervals of integers.
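The failure for polynomial iterates, and the behavior predicted by (6) for fractional powers, can be observed on the simplest rational rotation. The sketch below is an illustration under assumptions of our own choosing: the system is $T_1=T_2=T$ with $Tx=x+1$ on $\mathbb{Z}/2\mathbb{Z}$, both functions are the indicator of $\{0\}$, and the Hardy iterates are $[n^{3/2}]$ and $[n^{5/2}]$. Since $T$ is ergodic, each projection $\tilde f_i$ is the constant $1/2$, so formula (6) predicts the limit $1/4$.

```python
from math import isqrt


def two_term_average(iterate1, iterate2, n_terms, x=0):
    """Average over n = 1..n_terms of f(T^{iterate1(n)} x) * f(T^{iterate2(n)} x),
    where T(x) = x + 1 on Z/2Z and f is the indicator function of {0}."""
    hits = 0
    for n in range(1, n_terms + 1):
        if (x + iterate1(n)) % 2 == 0 and (x + iterate2(n)) % 2 == 0:
            hits += 1
    return hits / n_terms


N_TERMS = 100_000  # arbitrary cutoff chosen for the illustration

# Hardy iterates [n^(3/2)] and [n^(5/2)], computed exactly with integer square roots
hardy_avg = two_term_average(lambda n: isqrt(n ** 3), lambda n: isqrt(n ** 5), N_TERMS)

# Polynomial iterates n and n^2, which always have the same parity
poly_avg = two_term_average(lambda n: n, lambda n: n * n, N_TERMS)

print("Hardy iterates:     ", hardy_avg)
print("polynomial iterates:", poly_avg)
```

For the polynomial pair the average equals $1/2$, the integral of $f_1f_2$, rather than the product $1/4$ of the two integrals; for the Hardy pair the computed average should be close to $1/4$.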

2.2.2. Multiple recurrence. Using Theorem 2.3 we easily deduce the following: 
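Theorem 2.4. Under the assumptions of Theorem 2.3, if $A_0,A_1,\ldots,A_\ell\in\mathcal{X}$ satisfy

$$\mu\big(A_0\cap T_1^{-k_1}A_1\cap\cdots\cap T_\ell^{-k_\ell}A_\ell\big)=\alpha>0$$

for some $k_1,\ldots,k_\ell\in\mathbb{Z}$, then

(7) $$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mu\big(A_0\cap T_1^{-[a_1(n)]}A_1\cap\cdots\cap T_\ell^{-[a_\ell(n)]}A_\ell\big)\ge\alpha^{\ell+1}.$$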



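Proof. By Theorem 2.3 it suffices to show that

$$\int 1_{A_0}\cdot\mathbb{E}(1_{A_1}|\mathcal{I}_{T_1})\cdots\mathbb{E}(1_{A_\ell}|\mathcal{I}_{T_\ell})\,d\mu\ \ge\ \alpha^{\ell+1}.$$

Since each function $\mathbb{E}(1_{A_i}|\mathcal{I}_{T_i})$ is $T_i$-invariant, the left hand side is greater than or equal to

$$\int f\cdot\mathbb{E}(f|\mathcal{I}_{T_1})\cdots\mathbb{E}(f|\mathcal{I}_{T_\ell})\,d\mu\ \ge\ \Big(\int f\,d\mu\Big)^{\ell+1}=\alpha^{\ell+1},$$

where $f=1_{A_0\cap T_1^{-k_1}A_1\cap\cdots\cap T_\ell^{-k_\ell}A_\ell}$ and the last estimate follows from Lemma 1.6 in [13]. □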

^2 Even in the case where all the transformations are equal, our present argument has a technical advantage
over the argument used in [16]. This enables us to relax the growth condition used there.
Hence, the limit in (7) is positive if $\mu(A_0)>0$ and $\mu\big(\bigcup_{k\in\mathbb{Z}}T_i^kA_i\big)=1$ for $i=1,\ldots,\ell$.
Applying Theorem 2.4 for $A_0=\cdots=A_\ell=A$ and $k_1=\cdots=k_\ell=0$ we deduce:

Corollary 2.5. Under the assumptions of Theorem 2.3, for every set $A\in\mathcal{X}$ we have

(8) $$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mu\big(A\cap T_1^{-[a_1(n)]}A\cap\cdots\cap T_\ell^{-[a_\ell(n)]}A\big)\ge(\mu(A))^{\ell+1}.$$

Comments similar to those made after the statement of Theorem 2.3 apply here too.
Furthermore, if $\ell=2$ and $a_1=a_2$, then no power of $\mu(A)$ can be used as a lower bound in (8) (see
Theorem 2.1 in [6]).

2.3. Results in topological dynamics and combinatorics. Let $(X,d)$ be a compact metric
space and $T_1,\ldots,T_\ell\colon X\to X$ be invertible commuting continuous transformations. There
exists a Borel probability measure that is left invariant by all transformations. If in addition every
transformation is minimal (i.e. $\overline{\{T_i^nx\}_{n\in\mathbb{N}}}=X$ for every $x\in X$), then this measure gives positive
value to every non-empty open set, and for every $x\in X$ and non-empty open set $U$ the set
$\{n\in\mathbb{N}\colon T_i^nx\in U\}$ has bounded gaps (see for example [20]). As a consequence, for every $x\in X$
and non-empty open set $U$ we have $\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}1_U(T_i^nx)>0$, and using Theorem 2.3 we
get for almost every $x\in X$ (and hence for a dense set of $x\in X$) that

$$\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}1_{U_1}(T_1^{[a_1(n)]}x)\cdots 1_{U_\ell}(T_\ell^{[a_\ell(n)]}x)>0$$

whenever the sets $U_1,\ldots,U_\ell$ are taken from a given countable basis of non-empty open sets. Using this,
we deduce the following (the set of $x\in X$ for which (9) holds is trivially $G_\delta$ and $T_i$-invariant):

Theorem 2.6. Let $\mathcal{H}$ be a Hardy field and $a_1,\ldots,a_\ell\in\mathcal{G}\cap\mathcal{H}$ be functions with different growth
rates. Let $(X,d)$ be a compact metric space and $T_1,\ldots,T_\ell\colon X\to X$ be invertible commuting
minimal transformations. Then for a residual and $T_i$-invariant set of $x\in X$ we have

(9) $$\overline{\{(T_1^{[a_1(n)]}x,\ldots,T_\ell^{[a_\ell(n)]}x)\colon n\in\mathbb{N}\}}=X\times\cdots\times X.$$

Examples in [33] show that even when $\ell=1$ identity (9) may fail for an uncountable set
of $x\in X$. In fact, for every sequence of integers $(a(n))$ with zero density, it is shown in [33]
that there exists a totally minimal and uniquely ergodic topological dynamical system $(X,d,T)$
such that for an uncountable set of $x\in X$ one has $x\notin\overline{\{T^{a(n)}x,\ n\in\mathbb{N}\}}$. Examples of minimal
rotations on finite cyclic groups show that if $p\in\mathbb{Z}[t]$ is any polynomial $\neq\pm t+c$, then one may
have $\overline{\{T^{p(n)}x,\ n\in\mathbb{N}\}}\neq X$ for every $x\in X$.

Every continuous transformation $T$ on a compact metric space $(X,d)$ has a non-empty closed
$T$-invariant set $Y\subset X$ such that the transformation $T\colon Y\to Y$ is minimal (see for example
[20]). Using this, and Theorem 2.6 for $T_1=\cdots=T_\ell=T$, we deduce:

Corollary 2.7. Let $\mathcal{H}$ be a Hardy field and $a_1,\ldots,a_\ell\in\mathcal{G}\cap\mathcal{H}$ be functions with different
growth rates. Let $(X,d)$ be a compact metric space and $T\colon X\to X$ be an invertible continuous
transformation. Then for a non-empty and $T$-invariant set of $x\in X$ we have

(10) $$\overline{\{(T^{[a_1(n)]}x,\ldots,T^{[a_\ell(n)]}x)\colon n\in\mathbb{N}\}}=\overline{\{T^nx\colon n\in\mathbb{N}\}}\times\cdots\times\overline{\{T^nx\colon n\in\mathbb{N}\}}.$$

Again, simple examples show this result fails if $\ell=1$ and $p\in\mathbb{Z}[t]$ is any polynomial $\neq\pm t+c$.
2.4. Combinatorial consequences. For a set $A\subset\mathbb{Z}^d$, we define its upper density by $\bar d(A):=
\limsup_{N\to\infty}|A\cap[-N,N]^d|/(2N)^d$ (any other shift invariant mean works for our purposes).

Combining the previous multiple recurrence result with a multidimensional version of
Furstenberg's correspondence principle [20], we deduce the following consequence in combinatorics:

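Theorem 2.8. Let $\mathcal{H}$ be a Hardy field, $a_1,\ldots,a_\ell\in\mathcal{G}\cap\mathcal{H}$ be functions with different growth
rates, and $v_1,\ldots,v_\ell\in\mathbb{Z}^d$ be vectors. Suppose that the sets $E_0,E_1,\ldots,E_\ell\subset\mathbb{Z}^d$ satisfy

$$\bar d\big(E_0\cap(E_1+k_1)\cap\cdots\cap(E_\ell+k_\ell)\big)=\alpha>0$$

for some $k_1,\ldots,k_\ell\in\mathbb{Z}$. Then

$$\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bar d\big(E_0\cap(E_1-[a_1(n)]v_1)\cap\cdots\cap(E_\ell-[a_\ell(n)]v_\ell)\big)\ge\alpha^{\ell+1}.$$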

Using this for $E_0=\cdots=E_\ell=E$ and $k_1=\cdots=k_\ell=0$, we get the following strengthening
of the combinatorial result advertised in the introduction:

Corollary 2.9. Let $\mathcal{H}$ be a Hardy field, $a_1,\ldots,a_\ell\in\mathcal{G}\cap\mathcal{H}$ be functions with different growth
rates, and $v_1,\ldots,v_\ell\in\mathbb{Z}^d$ be vectors. Then for every set $E\subset\mathbb{Z}^d$ we have

$$\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bar d\big(E\cap(E-[a_1(n)]v_1)\cap\cdots\cap(E-[a_\ell(n)]v_\ell)\big)\ge(\bar d(E))^{\ell+1}.$$

Theorem 2.8 is also non-vacuous for syndetic sets $E_0,\ldots,E_\ell\subset\mathbb{N}$ (in this case one can take
$\alpha\ge(\prod_{i=0}^{\ell}s_i)^{-1}$ where $s_i$ is the syndeticity constant of the set $E_i$) and gives the following:

Corollary 2.10. Let $\mathcal{H}$ be a Hardy field and $a_1,\ldots,a_\ell\in\mathcal{G}\cap\mathcal{H}$ be functions with different
growth rates. Let $E_0,E_1,\ldots,E_\ell\subset\mathbb{N}$ be syndetic sets. Then there exist $m,n\in\mathbb{N}$ such that

$$m\in E_0,\quad m+[a_1(n)]\in E_1,\quad\ldots,\quad m+[a_\ell(n)]\in E_\ell.$$

Corollary 2.10 enables us to solve some non-shift invariant systems of equations within every
syndetic set. For instance, for a syndetic set $E\subset\mathbb{N}$, we can take $E_0:=cE$, $E_i:=c_iE$,
$i=1,\ldots,\ell$, where $c, c_i$ are arbitrary positive integers and $cE:=\{ck,\ k\in E\}$, and deduce that
the system of equations

$$c_1x_1-cx_0=[a_1(n)]$$
$$c_2x_2-cx_0=[a_2(n)]$$
$$\vdots$$
$$c_\ell x_\ell-cx_0=[a_\ell(n)]$$

has a solution with $x_0,x_1,\ldots,x_\ell\in E$ and $n\in\mathbb{N}$.^3 Another consequence is that for any finite
coloring of $\mathbb{N}$, where each color class is a syndetic set, the previous system has a solution
with $x_0,\ldots,x_\ell$ having arbitrary colors. In other words, if the color classes are denoted by
$C_1,\ldots,C_k$, we can have $x_0\in C_{i_0},\ldots,x_\ell\in C_{i_\ell}$, where $i_0,\ldots,i_\ell\in\{1,\ldots,k\}$ are arbitrary.
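A solution of such a system can be found by brute force in a concrete case. The sketch below is purely illustrative and rests on choices that are not in the text: the syndetic set is taken to be $E=\{k\in\mathbb{N}\colon k\equiv 3 \pmod 7\}$, the system is $2y-x=[n^{3/2}]$, $3z-x=[n^{5/2}]$, and the search bounds `max_n` and `max_x` are arbitrary.

```python
from math import isqrt


def in_E(k):
    # Hypothetical syndetic set: positive integers congruent to 3 modulo 7
    return k > 0 and k % 7 == 3


def find_solution(max_n=2000, max_x=300):
    """Search for x, y, z in E and n with 2y - x = [n^(3/2)] and 3z - x = [n^(5/2)]."""
    for n in range(1, max_n + 1):
        a_n = isqrt(n ** 3)  # [n^(3/2)], computed exactly
        b_n = isqrt(n ** 5)  # [n^(5/2)], computed exactly
        for x in range(3, max_x, 7):  # the elements of E below max_x
            two_y, three_z = x + a_n, x + b_n
            if two_y % 2 != 0 or three_z % 3 != 0:
                continue
            y, z = two_y // 2, three_z // 3
            if in_E(y) and in_E(z):
                return x, y, z, n
    return None


solution = find_solution()
print("(x, y, z, n) =", solution)
```

The search only confirms existence for this particular set; the content of Corollary 2.10 is that a solution exists for every syndetic set.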

^3 Similar results fail for polynomial sequences and also fail when the set $E$ is only assumed to be piecewise
syndetic. Easy examples show that: (i) If $p\in\mathbb{Z}[t]$ is any polynomial different than $\pm t+c$ and $k\in\mathbb{N}$ is different
than 1, then the equation $kx-y=p(n)$ has no solution with $x, y$ belonging in some set $E$ that is an arithmetic
progression. (ii) If $(a(n))$ is a sequence of integers with $a(n+1)-a(n)\to\infty$ and $k\neq 1$, then there exists a
thick set $E$ such that the equation $x-ky=a(n)$ has no solution with $x,y\in E$.
2.5. Key ingredients and proof plan.

2.5.1. Key ingredients. The proof of Theorem 2.3 uses the following key ingredients: 

Gowers-Host-Kra seminorms. These are non-negative numbers associated with every bounded
measurable function (see Section 3.2). They were defined in a combinatorial setting in [22]
and in an ergodic setting in [25]. We seek to control the $L^2(\mu)$ norm of the multiple ergodic
averages in (6) by the seminorms of the individual functions involved.

Van der Corput's Lemma. This elementary estimate, and variations of it (see Section 3.5), is
the key ingredient used to get the desired seminorm estimates.

Decomposition results. These are used to replace sequences of the form $(f(T^nx))$ with sequences
that have more desirable properties. We use two decompositions, one involving dual sequences
(Proposition 3.4), and another, much deeper one, involving nilsequences (Theorem 3.5). Both
decompositions originate from [25].

Equidistribution results on nilmanifolds. These are used towards the end of our argument when
one replaces sequences of the form $(f(T^nx))$ with nilsequences. They enable us to carry out
the finer analysis needed to prove identity (6). The equidistribution results were proved in [15]
using results from [23] on quantitative equidistribution of polynomial sequences on nilmanifolds.

2.5.2. Combining the key ingredients. Crucial to the proof of Theorem 2.3 are some seminorm
estimates showing that the limit in (6) is 0 when at least one of the functions involved is
"uniform enough". We establish these estimates in two steps. First, we prove them for the
function that is associated with the fastest growing iterate (Propositions 4.1 and 5.2). This 
part of the proof borrows ideas from [14] in order to devise an appropriate inductive scheme 
(similar to the PET induction of [2]) based on successive uses of van der Corput's Lemma. 
Next, we use this first step, and the decomposition result of Proposition 3.4, in order to replace 
one of the functions with a function that (when evaluated in the orbit of the corresponding 
transformation) gives rise to sequences (called dual sequences) defined by a certain averaging 
operation. It is then possible to devise another induction based again on successive uses of 
van der Corput's lemma and produce seminorm estimates for the function associated with 
the second fastest growing iterate (Proposition 6.2). Continuing like this, we get seminorm 
estimates for all the functions (Proposition 7.1). 

Using the seminorm estimates and the decomposition result of Theorem 3.5, we get that the
limit in (6) remains unchanged when we replace each function with a function that pointwise
gives rise to nilsequences. At this advanced point in the proof, we are in position to apply
known equidistribution results on nilmanifolds from [15] to complete the proof of Theorem 2.3.

For technical reasons, complications arise in implementing the previous plan when one or
more iterates have sub-linear growth. These complications are handled using a variant of
the aforementioned seminorm estimates (Proposition 7.3) and the equidistribution results on
nilmanifolds (Proposition 7.5).

Recently, a relatively simple method for proving mean convergence of the polynomial averages 
(2) was developed in [35] (based on ideas from [34]), but up to this point it has not been 
successful in giving detailed information for the limiting function. Since the precise form of the 
limit is the most crucial part of our main result, and is needed for applications, it seems that 
we are forced to carry out the more refined analysis summarized above. 
2.6. Further directions. We believe that in Theorem 2.3 (and its various consequences) the
restrictions we impose on the functions $a_1,\ldots,a_\ell$ can be weakened considerably. We record
here a related problem (a special case of this already appears in [16, 17]):

Problem 1. Given a Hardy field $\mathcal{H}$, show that the conclusion of Theorem 2.3 holds if the
functions $a_1,\ldots,a_\ell\in\mathcal{H}$ have polynomial growth rate and every non-trivial linear combination
$a(t)$ of these functions satisfies $|a(t)-cp(t)|/\log t\to\infty$ for every $c\in\mathbb{R}$ and every $p\in\mathbb{Z}[t]$.

When $\ell=1$ the result follows from the equidistribution results in [10]. The problem is open
even when $\ell=2$ and $T_1=T_2$.

When the sequences $a_1,\ldots,a_\ell$ are equal, the methods used in this article do not seem
particularly helpful in studying the limiting behavior of the averages in (6) (mainly because
the seminorm estimates we use here fail in this case). We record a related problem (a special
case of this already appears in [16, 17]):

Problem 2. Let $a\colon[c,\infty)\to\mathbb{R}$ be a Hardy field function with polynomial growth rate that
satisfies $|a(t)-cp(t)|/\log t\to\infty$ for every $c\in\mathbb{R}$ and every $p\in\mathbb{Z}[t]$. Show that for every
system $(X,\mathcal{X},\mu,T_1,\ldots,T_\ell)$ and functions $f_1,\ldots,f_\ell\in L^\infty(\mu)$ the averages

(11) $$\frac{1}{N}\sum_{n=1}^{N}T_1^{[a(n)]}f_1\cdots T_\ell^{[a(n)]}f_\ell$$

converge in $L^2(\mu)$ and their limit is $\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T_1^nf_1\cdots T_\ell^nf_\ell$ (this limit exists by [34]).

The case where $T_1,\ldots,T_\ell$ are powers of a single transformation was treated in [16]. In the

generality stated, the problem is open even when £ = 2 and a{t) = t^/^. 

Regarding pointwise convergence of the averages in (6), progress has been very scarce. The
case $\ell=1$ was treated in [11], but other than this, even the simplest cases remain open.

Problem 3. Let $a, b$ be distinct positive non-integers. Show that for every ergodic system
$(X,\mathcal{X},\mu,T)$ and functions $f,g\in L^\infty(\mu)$, we have

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}f(T^{[n^a]}x)\cdot g(T^{[n^b]}x)=\int f\,d\mu\cdot\int g\,d\mu$$

for almost every $x\in X$.

All cases where both $a$ and $b$ are greater than 1 are open.

2.7. Notational conventions. The following notation will be used throughout the article:
$\mathbb{N}=\{1,2,\ldots\}$, $Tf=f\circ T$, $T^k=T\circ\cdots\circ T$, $1_E$ is the indicator function of a set $E$, $\mathcal{C}^kz$ is
$z$ if $k$ is even and $\bar z$ if $k$ is odd. We often write $\infty$ instead of $+\infty$. If $a(t), b(t)$ are real valued
functions defined on some half-line $[c,\infty)$ we write $a(t)\prec b(t)$ if $a(t)/b(t)\to 0$ as $t\to\infty$. We
write $a(t)\ll b(t)$ if there exists $C\in\mathbb{R}$ such that $|a(t)|\le C|b(t)|$ for all large enough $t\in\mathbb{R}$, and
$a\sim b$ if $a(t)/b(t)$ converges to a nonzero constant as $t\to\infty$. We denote by $S_ha$ the function
defined by $(S_ha)(t)=a(t+h)$. A function $a\colon[c,\infty)\to\mathbb{R}$ has degree $d$ if $t^d\ll a(t)\prec t^{d+1}$. By
$\mathcal{H}$ we denote a translation invariant Hardy field and by $\mathcal{G}$ the set of functions $a\colon[c,\infty)\to\mathbb{R}$
that satisfy $t^k\log t\prec a(t)\prec t^{k+1}$ for some integer $k\ge 0$. If $(X,\mathcal{X},\mu,T)$ is a system, $\mathcal{I}_T$ denotes
the $\sigma$-algebra of $T$-invariant sets and $\mathbb{E}(f|\mathcal{I}_T)$ the conditional expectation on $\mathcal{I}_T$.
3. Background Material
In this section we put together some background material that we use throughout this article.

3.1. Basic facts about Hardy fields. Let $B$ be the collection of equivalence classes of real
valued functions defined on some half line $[c,\infty)$, where we identify two functions if they agree
eventually.^4 A Hardy field $\mathcal{H}$ is a subfield of the ring $(B,+,\cdot)$ that is closed under differentiation
(a term first used by the Bourbaki group in [12]). A Hardy field function is a function that
belongs to some Hardy field. We are going to assume throughout that all Hardy fields mentioned
are translation invariant, meaning that if $a(t)\in\mathcal{H}$, then $a(t+h)\in\mathcal{H}$ for every $h\in\mathbb{R}$.

A particular example of such a Hardy field is the set $\mathcal{LE}$ that was introduced by Hardy
in [24] and consists of all logarithmic-exponential functions, meaning all functions defined on
some half line $(c,\infty)$ by a finite combination of the symbols $+,-,\times,:,\log,\exp$, operating on
the real variable t and on real constants. For example functions such as t"^, t(logt)^, e*^, 
gViog logy log(t^ + 1), are all elements of C£. Another, even more extensive example was 
constructed by Boshernitzan in [9]. It satisfies the following properties: 

• it contains the set $\mathcal{LE}$;

• it is closed under integration; and 

• it is closed under composition of functions that increase to infinity. 

Every Hardy field function is eventually monotonic. If one of the functions $a, b\colon[c,\infty)\to\mathbb{R}$
belongs to a Hardy field, and the other function belongs to the same Hardy field or to $\mathcal{LE}$, then
the limit $\lim_{t\to\infty}a(t)/b(t)$ exists (possibly infinite). This property is key and will often justify
our use of l'Hôpital's rule. We are going to freely use all these properties without any further
explanation in the sequel. The reader can find further discussion about Hardy fields in [9, 10]
and the references therein.

Definition 3.1. We say that two functions $a,b\colon[c,\infty)\to\mathbb{R}$ have the same growth rate, and
write $a\sim b$, if $a(t)/b(t)$ converges to a nonzero constant as $t\to\infty$. We say that the function
$a\colon[c,\infty)\to\mathbb{R}$ has polynomial growth rate if $a(t)\prec t^k$ for some $k\in\mathbb{N}$.

Notice that if the functions $a, b$ belong to the same Hardy field, then one of the following three
alternatives holds: $a\prec b$, $b\prec a$, $a\sim b$. A key property of Hardy field functions with polynomial
growth is that we can relate their growth rates with the growth rates of their derivatives:

Lemma 3.2. Let $a\colon[c,\infty)\to\mathbb{R}$ be a Hardy field function with polynomial growth.
(i) If a y 1, then a' <C a/t. 

(ii) If $a\succ t^\varepsilon$ for some $\varepsilon>0$, then $a'\sim a/t$ and for every non-zero $h\in\mathbb{R}$ we have
$S_ha-a\sim a/t$.

Proof. Applying l'Hôpital's rule we get

(12) $$\lim_{t\to\infty}\frac{ta'(t)}{a(t)}=\lim_{t\to\infty}\frac{(\log|a(t)|)'}{(\log t)'}=\lim_{t\to\infty}\frac{\log|a(t)|}{\log t}.$$

Since $a(t)$ has polynomial growth, the last limit is a non-negative real number. Hence, $a'\ll a/t$.



^4 The equivalence classes just defined are often called germs of functions. We choose to use the word function
when we refer to elements of $B$ instead, with the understanding that all the operations defined and statements
made for elements of $B$ are considered only for sufficiently large values of $t\in\mathbb{R}$.
If furthermore one has $t^\varepsilon\prec a(t)$ for some $\varepsilon>0$ and $a(t)$ has polynomial growth, then the
previous limit is a positive real number. This implies that $a'\sim a/t$. Lastly, suppose that $h>0$
(a similar argument applies if $h<0$). The mean value theorem gives that

$$a(t+h)-a(t)=h\,a'(\xi_t)$$

for some $\xi_t\in[t,t+h]$. Applying l'Hôpital's rule we get $a'(\xi_t)/a'(t)\sim a(\xi_t)/a(t)$ and one easily
sees that $a(\xi_t)/a(t)\sim 1$. Combining the above we get $S_ha-a\sim a'$. The proof is complete
since by the first claim $a'\sim a/t$. □
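The three quantities appearing in this proof can be compared numerically for a sample function. The sketch below is only an illustration under our own assumptions: the test function $a(t)=t^{3/2}\log t$, the evaluation point $t=10^{12}$, and the helper names are all chosen for the example. Here the common limit in (12) is $3/2$, although the logarithmic ratio approaches it slowly.

```python
import math


def a(t):
    # Sample function with t^(1/2) growing slower than a(t), and a(t) slower than t^2
    return t ** 1.5 * math.log(t)


def a_prime(t):
    # Derivative of a, computed by hand
    return 1.5 * t ** 0.5 * math.log(t) + t ** 0.5


def derivative_ratio(t):
    """t a'(t) / a(t): the left hand side of (12)."""
    return t * a_prime(t) / a(t)


def log_ratio(t):
    """log|a(t)| / log t: the right hand side of (12)."""
    return math.log(abs(a(t))) / math.log(t)


def shift_ratio(t, h=1.0):
    """(a(t + h) - a(t)) / (h a(t) / t): compares S_h a - a with a / t."""
    return (a(t + h) - a(t)) / (h * a(t) / t)


T_LARGE = 1e12  # arbitrary large evaluation point
print("t a'(t)/a(t)       :", derivative_ratio(T_LARGE))
print("log a(t) / log t   :", log_ratio(T_LARGE))
print("(S_1 a - a)/(a/t)  :", shift_ratio(T_LARGE))
```

All three values should be near $3/2$, with the first and third nearly equal, in line with $a'\sim a/t$ and $S_ha-a\sim a/t$.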

3.2. Basic facts from ergodic theory. A system $(X,\mathcal{X},\mu,T)$ is a Lebesgue probability space
$(X,\mathcal{X},\mu)$ together with an invertible measure preserving transformation $T\colon X\to X$.

The ergodic theorem. The ergodic theorem states that for every system $(X,\mathcal{X},\mu,T)$ and
function $f\in L^1(\mu)$ we have for almost every $x\in X$ that

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}f(T^nx)=\tilde f(x)$$

where $\tilde f=\mathbb{E}(f|\mathcal{I}_T)$ and

$$\mathcal{I}_T:=\{A\in\mathcal{X}\colon\mu(T^{-1}A\,\triangle\,A)=0\}.$$



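Gowers-Host-Kra uniformity seminorms. Following [25], where a similar definition was given
for ergodic systems, given a system $(X,\mathcal{X},\mu,T)$ and a function $f\in L^\infty(\mu)$, we define inductively

$$|||f|||_{1,T}:=\|\mathbb{E}(f|\mathcal{I}_T)\|_{L^2(\mu)};$$

(13) $$|||f|||_{k+1,T}^{2^{k+1}}:=\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}|||\bar f\cdot T^nf|||_{k,T}^{2^k}.$$

That all limits exist and $|||\cdot|||_{k,T}$ is a seminorm can be proved as in [25]. Furthermore, the limit
in (13) remains unchanged if replaced with the uniform limit $\lim_{N-M\to\infty}\frac{1}{N-M}\sum_{n=M}^{N-1}$. Using
the ergodic theorem one gets $|||f|||_{1,T}^2=\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\int\bar f\cdot T^nf\,d\mu$ and, more generally, that

(14) $$|||f|||_{k,T}^{2^k}=\lim_{N_k\to\infty}\frac{1}{N_k}\sum_{n_k=1}^{N_k}\cdots\lim_{N_1\to\infty}\frac{1}{N_1}\sum_{n_1=1}^{N_1}\int\prod_{\epsilon\in\{0,1\}^k}\mathcal{C}^{|\epsilon|}T^{n\cdot\epsilon}f\,d\mu$$

where $n=(n_1,\ldots,n_k)$ and for $\epsilon\in\{0,1\}^k$ we let

$$n\cdot\epsilon:=n_1\epsilon_1+\cdots+n_k\epsilon_k,\qquad|\epsilon|=\epsilon_1+\cdots+\epsilon_k,$$

and for $z\in\mathbb{C}$ and $k$ nonnegative integer we let

$$\mathcal{C}^kz:=\begin{cases}z&\text{if }k\text{ is even}\\ \bar z&\text{if }k\text{ is odd.}\end{cases}$$

It follows from Theorem 13.1 in [25] that in (14) the iterative limit can be replaced with the
limit $\lim_{N\to\infty}\frac{1}{N^k}\sum_{1\le n_1,\ldots,n_k\le N}$. Using (14) and the ergodic theorem one can check that

(15) $$|||f\otimes\bar f|||_{k,T\times T}\le|||f|||_{k+1,T}^2$$

holds for every $k\in\mathbb{N}$. We also remark that $|||f|||_{k,T}\le|||f|||_{k+1,T}$ holds for every $k\in\mathbb{N}$.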
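On a finite system the inductive definition can be evaluated exactly, which makes the seminorms easy to experiment with. The following sketch assumes a specific toy system that is not discussed in the text: the rotation $Tx=x+1$ on $\mathbb{Z}/m\mathbb{Z}$ with normalized counting measure. This $T$ is ergodic, so $\mathbb{E}(f|\mathcal{I}_T)$ is the mean of $f$, and every limit of averages over $n$ equals the average over one period. The function names are hypothetical.

```python
import cmath


def seminorm_power(f, k):
    """Return |||f|||_k^(2^k) for T(x) = x + 1 on Z/mZ, where f is a list of
    m complex values and the measure is normalized counting measure."""
    m = len(f)
    if k == 1:
        # Ergodic case: the projection onto invariant functions is the mean of f
        return abs(sum(f) / m) ** 2
    total = 0.0
    for n in range(m):  # averaging over one full period gives the limit in (13)
        g = [f[x].conjugate() * f[(x + n) % m] for x in range(m)]
        total += seminorm_power(g, k - 1)
    return total / m


def seminorm(f, k):
    return seminorm_power(f, k) ** (1.0 / 2 ** k)


M = 6
character = [cmath.exp(2j * cmath.pi * x / M) for x in range(M)]
sample = [1 + 0j, 0j, 2 + 0j, -1 + 0j, 0.5 + 0j, 3 + 0j]

print("character:", [seminorm(character, k) for k in (1, 2)])
print("sample:   ", [seminorm(sample, k) for k in (1, 2, 3)])
```

A non-trivial character has first seminorm 0 but second seminorm 1, and the values for the sample function are non-decreasing in $k$, as in the last remark above.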
3.3. Dual functions, dual sequences, and weak decomposition.

3.3.1. Dual functions. Let $(X,\mathcal{X},\mu,T)$ be a system, $f\in L^\infty(\mu)$, and $M\in\mathbb{N}$. We define

$$A_M(f):=\frac{1}{M^k}\sum_{m\in[1,M]^k}\ \prod_{\epsilon\in\{0,1\}^k,\ \epsilon\neq 00\cdots0}\mathcal{C}^{|\epsilon|}T^{m\cdot\epsilon}f.$$

It is shown in [25] that the averages $A_M(f)$ converge in $L^2(\mu)$ and in [1] that they converge
pointwise. We define

$$\mathcal{D}_{k,T}f:=\lim_{M\to\infty}A_M(f)$$

and call any such function a level $k$ dual function. For instance, we have

$$\mathcal{D}_{2,T}f=\lim_{M\to\infty}\frac{1}{M^2}\sum_{1\le m_1,m_2\le M}T^{m_1}\bar f\cdot T^{m_2}\bar f\cdot T^{m_1+m_2}f.$$

The importance of dual functions in the current article stems from the following result (it
follows from (14) and the fact that the iterative limit can be replaced with a limit over cubes):

Proposition 3.3. Let $(X,\mathcal{X},\mu,T)$ be a system. Then for every $f\in L^\infty(\mu)$ and $k\in\mathbb{N}$ we have

$$\int f\cdot\mathcal{D}_{k,T}f\,d\mu=|||f|||_{k,T}^{2^k}.$$

As a consequence, $|||f|||_{k,T}\neq 0$ if and only if $f$ positively correlates with some dual function
of level $k$.
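The identity of Proposition 3.3 can be checked by direct computation in a finite toy case. The sketch below makes assumptions that are not part of the text: it takes $k=2$, the rotation $Tx=x+1$ on $\mathbb{Z}/m\mathbb{Z}$ with normalized counting measure, and an arbitrary sample function. Averages over one full period replace the limits, and the function names are hypothetical.

```python
def dual_function_level2(f):
    """Level 2 dual function of f for T(x) = x + 1 on Z/mZ: the average over
    (m1, m2) of conj(f(x+m1)) * conj(f(x+m2)) * f(x+m1+m2)."""
    m = len(f)
    dual = []
    for x in range(m):
        total = 0j
        for m1 in range(m):
            for m2 in range(m):
                total += (f[(x + m1) % m].conjugate()
                          * f[(x + m2) % m].conjugate()
                          * f[(x + m1 + m2) % m])
        dual.append(total / m ** 2)
    return dual


def correlation_with_dual(f):
    """Integral of f times its level 2 dual function."""
    m = len(f)
    dual = dual_function_level2(f)
    return sum(f[x] * dual[x] for x in range(m)) / m


def seminorm2_fourth_power(f):
    """|||f|||_2^4, computed from the inductive definition (13) with k = 1."""
    m = len(f)
    total = 0.0
    for n in range(m):
        corr = sum(f[x].conjugate() * f[(x + n) % m] for x in range(m)) / m
        total += abs(corr) ** 2
    return total / m


sample = [1 + 2j, 0j, -1j, 3 + 0j, 0.5 - 1j]
lhs = correlation_with_dual(sample)
rhs = seminorm2_fourth_power(sample)
print("integral of f * D_2 f:", lhs)
print("|||f|||_2^4:          ", rhs)
```

Up to rounding error the two printed values agree, and the first has vanishing imaginary part.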

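3.3.2. Dual sequences. A dual sequence of level $k$ is a sequence $(\mathcal{D}(n))$ of the form

$$\mathcal{D}(n):=\lim_{M\to\infty}\frac{1}{M^k}\sum_{m\in[1,M]^k}\ \prod_{\epsilon\in\{0,1\}^k,\ \epsilon\neq 00\cdots0}\mathcal{C}^{|\epsilon|}d(n+m\cdot\epsilon),$$

where $(d(n))$ is a bounded sequence such that the above limit exists for every $n\in\mathbb{N}$.
For future use, we record the identity

(16) $$\mathcal{D}(n)=\lim_{M\to\infty}\frac{1}{M^k}\sum_{m\in[1,M]^k}\ \prod_{\epsilon\in\{0,1\}^k,\ \epsilon\neq 00\cdots0}\mathcal{C}^{|\epsilon|}d_\epsilon(m+ne)$$

where $e$ is any vector in $\{0,1\}^k$ such that $\epsilon\cdot e=1$ and

$$d_\epsilon(m)=d(\epsilon\cdot m).$$

For instance, if $(\mathcal{D}(n))$ is a dual sequence of level 2, then

$$\mathcal{D}(n)=\lim_{M\to\infty}\frac{1}{M^2}\sum_{1\le m_1,m_2\le M}\bar d(n+m_1)\cdot\bar d(n+m_2)\cdot d(n+m_1+m_2)$$

$$=\lim_{M\to\infty}\frac{1}{M^2}\sum_{1\le m_1,m_2\le M}\bar d_1(m_1+n,m_2)\cdot\bar d_2(m_1,m_2+n)\cdot d_3(m_1+n,m_2),$$

where

$$d_1(m_1,m_2):=d(m_1),\quad d_2(m_1,m_2):=d(m_2),\quad d_3(m_1,m_2):=d(m_1+m_2).$$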
3.3.3. Weak decomposition. For the purpose of this article the significance of the collection of
dual sequences stems from the following decomposition result:

Proposition 3.4 (Weak decomposition). Let $(X,\mathcal{X},\mu,T)$ be a system, $f\in L^\infty(\mu)$, and $k\in\mathbb{N}$.
Then for every $\varepsilon>0$, there exist functions $f_s,f_u,f_e\in L^\infty(\mu)$, such that

(1) $f=f_s+f_u+f_e$;

(2) $|||f_u|||_k=0$; $\|f_e\|_{L^1(\mu)}\le\varepsilon$; and

(3) $f_s=\sum_{i=1}^{L}c_i\,f_{s,i}$, where $c_i\in\mathbb{R}$, $f_{s,i}\in L^\infty(\mu)$, and for almost every $x\in X$ the sequence
$(f_{s,i}(T^nx))_{n\in\mathbb{N}}$ is a dual sequence of level $k$.

Proof. Let $\varepsilon>0$, $k\in\mathbb{N}$, and $f\in L^\infty(\mu)$. We construct an invariant sub-$\sigma$-algebra $\mathcal{Z}_{k-1}$ of $\mathcal{X}$
exactly as in Section 4 of [25]. It satisfies the same property as in Lemma 4.3 of [25], namely,

(17) for $f\in L^\infty(\mu)$, $\mathbb{E}(f|\mathcal{Z}_{k-1})=0$ if and only if $|||f|||_k=0$.

We can decompose $f$ as $f=f_u+g$ where $g=\mathbb{E}(f|\mathcal{Z}_{k-1})$ and $f_u\perp L^\infty(\mathcal{Z}_{k-1},\mu)$. It follows
from (17) that $|||f_u|||_k=0$. It is clear that $g\in L^\infty(\mu)$.

We claim that linear combinations of dual functions of level $k$ are dense in $L^1(\mathcal{Z}_{k-1},\mu)$.
Indeed, by duality, it suffices to show that if $h\in L^\infty(\mathcal{Z}_{k-1},\mu)$ satisfies $\int h\cdot\mathcal{D}_{k,T}f'\,d\mu=0$ for
every $f'\in L^\infty(\mu)$, then $h=0$. Taking $f'=h$ gives $\int h\cdot\mathcal{D}_{k,T}h\,d\mu=0$, and by Proposition 3.3
we get $|||h|||_k=0$. Since $h\in L^\infty(\mathcal{Z}_{k-1},\mu)$, we deduce from (17) that $h=0$. This completes the
proof of the claim.

Keeping in mind that $g\in L^\infty(\mathcal{Z}_{k-1},\mu)$, the claim enables us to decompose $g$ as $g=f_s+f_e$,
where $f_s$ is a finite linear combination of dual functions of level $k$ and $\|f_e\|_{L^1(\mu)}\le\varepsilon$. Since
the function $g$ and all dual functions are bounded, the function $f_e$ is bounded. The proof ends
upon noticing that if $h$ is a dual function of level $k$, then for almost every $x\in X$ the sequence
$(h(T^nx))_{n\in\mathbb{N}}$ is a dual sequence of level $k$. □

3.4. Nilsystems, nilsequences, and strong decomposition. A nilmanifold is a homogeneous space $X = G/\Gamma$ where $G$ is a nilpotent Lie group, and $\Gamma$ is a discrete cocompact subgroup of $G$. If $G_{k+1} = \{e\}$, where $G_k$ denotes the $k$-th commutator subgroup of $G$, we say that $X$ is a $k$-step nilmanifold.

A $k$-step nilpotent Lie group $G$ acts on $G/\Gamma$ by left translation where the translation by a fixed element $a \in G$ is given by $T_a(g\Gamma) = (ag)\Gamma$. By $m_X$ we denote the unique probability measure on $X$ that is invariant under the action of $G$ by left translations (called the normalized Haar measure), and by $\mathcal{G}/\Gamma$ we denote the Borel $\sigma$-algebra of $G/\Gamma$. Fixing an element $a \in G$, we call the system $(G/\Gamma, \mathcal{G}/\Gamma, m_X, T_a)$ a $k$-step nilsystem. The reader can find more material about nilmanifolds in [31] and the references therein.

If $X = G/\Gamma$ is a $k$-step nilmanifold, $a \in G$, $x \in X$, and $f \in C(X)$, we call the sequence $(f(a^n x))_{n \in \mathbb{N}}$ a basic $k$-step nilsequence. A $k$-step nilsequence is a uniform limit of basic $k$-step nilsequences.

3.4.1. Strong decomposition. The next decomposition result will be crucial for our study. For ergodic systems it is a direct consequence of a deep structure theorem in [25]; the extension to the non-ergodic case was treated in [14] (see Proposition 3.1).

Theorem 3.5 (Strong decomposition). Let $(X, \mathcal{X}, \mu, T)$ be a system, $f \in L^\infty(\mu)$, and $k \in \mathbb{N}$. Then for every $\varepsilon > 0$, there exist functions $f_s, f_u, f_e \in L^\infty(\mu)$, with $L^\infty(\mu)$ norm at most $2\|f\|_{L^\infty(\mu)}$, such that

(1) $f = f_s + f_u + f_e$;

(2) $|||f_u|||_{k+1} = 0$; $\|f_e\|_{L^2(\mu)} \leq \varepsilon$; and

(3) for almost every $x \in X$ the sequence $(f_s(T^n x))_{n \in \mathbb{N}}$ is a $k$-step nilsequence.

3.5. The van der Corput Lemma. A key tool in proving uniformity estimates is the follow- 
ing variant of van der Corput's fundamental estimate (proved as in Lemma 3.1 in [30]): 

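Lemma 3.6. Let $N \in \mathbb{N}$ and $v_1, \ldots, v_N$ be vectors in an inner product space. Then for every integer $H$ between $1$ and $N$ we have ($\Re(z)$ denotes the real part of a complex number $z$)

$$\Big\| \frac{1}{N} \sum_{n=1}^{N} v_n \Big\|^2 \leq \frac{2}{H} \cdot \frac{1}{N} \sum_{n=1}^{N} \|v_n\|^2 + \frac{4}{H} \sum_{h=1}^{H} \Big(1 - \frac{h}{H}\Big) \Re\Big( \frac{1}{N} \sum_{n=1}^{N-h} \langle v_{n+h}, v_n \rangle \Big).$$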

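We also use the following qualitative variant:

Lemma 3.7. Let $(v_n)$ be a bounded sequence of vectors in an inner product space, and $(\Phi_N)$ be a Følner sequence of subsets of $\mathbb{N}$. Then

$$\limsup_{N\to\infty} \Big\| \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} v_n \Big\|^2 \leq 4 \limsup_{H\to\infty} \frac{1}{H} \sum_{h=1}^{H} \limsup_{N\to\infty} \Big| \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} \langle v_{n+h}, v_n \rangle \Big|.$$

In most cases we apply this lemma for $\Phi_N = [1, N]$, $N \in \mathbb{N}$.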
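The finite van der Corput estimate can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes the vectors to be complex numbers (a one-dimensional inner product space with $\langle u, v \rangle = u\bar{v}$) and assumes the normalisation $\|\frac1N\sum v_n\|^2 \le \frac{2}{HN}\sum\|v_n\|^2 + \frac4H\sum_{h=1}^{H}(1-\frac hH)\Re(\frac1N\sum_{n=1}^{N-h}\langle v_{n+h},v_n\rangle)$, which follows from the classical van der Corput inequality. The function name `vdc_sides` is hypothetical.

```python
import random

def vdc_sides(v, H):
    """Return (lhs, rhs) of the assumed finite van der Corput inequality
    for a list v of complex numbers and an integer 1 <= H <= len(v)."""
    N = len(v)
    lhs = abs(sum(v) / N) ** 2
    # First term: (2/H) times the average of |v_n|^2.
    energy = (2.0 / H) * sum(abs(x) ** 2 for x in v) / N
    corr = 0.0
    for h in range(1, H + 1):
        # <v_{n+h}, v_n> = v_{n+h} * conj(v_n), summed over n and divided by N.
        inner = sum(v[n + h] * v[n].conjugate() for n in range(N - h)) / N
        corr += (1 - h / H) * inner.real
    return lhs, energy + (4.0 / H) * corr

random.seed(0)
v = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
checks = [vdc_sides(v, H) for H in (1, 5, 50, 200)]
constant_check = vdc_sides([1 + 0j] * 100, 10)
```

For a random sequence the correlation terms are small, so the right-hand side is dominated by the $2/H$ term; for a constant sequence the correlation terms are what keep the right-hand side above 1.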

4. Seminorm estimates for the highest degree iterate: Two transformations

An important step towards establishing Theorem 2.3 is to obtain estimates that enable us to control the $L^2(\mu)$ norm of the averages in (6) by the uniformity seminorms of the individual functions. In this section and the next one, our goal is to do this for the function that is associated with the fastest growing iterate. In subsequent sections we utilize this information in order to get similar estimates for the other functions.

Since the proof is notationally heavy, we choose to first present it in detail for the case of 
two commuting transformations. The argument that covers the general case is very similar and 
we sketch its proof in the next section. 

The main goal in this section is to establish the following result: 

Proposition 4.1. Let $(X, \mathcal{X}, \mu, T_1, T_2)$ be a system and $f_1, f_2 \in L^\infty(\mu)$ be functions. Let $\mathcal{H}$ be a Hardy field, $a_1, a_2 \in \mathcal{G} \cap \mathcal{H}$ be functions that satisfy $a_1 \succ a_2$, and let $d := \deg(a_1)$ (all notions are defined in Section …). Then there exists $k = k(d)$ such that: If $|||f_1|||_{k,T_1} = 0$, then the averages

$$\frac{1}{N} \sum_{n=1}^{N} T_1^{[a_1(n)]} f_1 \cdot T_2^{[a_2(n)]} f_2$$

converge to $0$ in $L^2(\mu)$.

Our method necessitates that we prove a more general result that we present next.

Proposition 4.2. Let $(X, \mathcal{X}, \mu, T_1, T_2)$ be a system and $f_1, \ldots, f_m \in L^\infty(\mu)$ be functions. Let $(\mathcal{A}, \mathcal{B})$ be a nice ordered family of pairs of functions with degree $d$ (all notions are defined in Sections 4.1 and 4.2). Then there exists $k = k(d, m) \in \mathbb{N}$ such that: If $|||f_1|||_{k,T_1} = 0$, then

$$\lim_{N\to\infty} \sup_{E \subset \mathbb{N}} \Big\| \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}_E(n) \prod_{i=1}^{m} (T_1^{[a_i(n)]} T_2^{[b_i(n)]}) f_i \Big\|_{L^2(\mu)} = 0. \tag{18}$$

Applying this result to the nice family $(\mathcal{A}, \mathcal{B})$ defined by $\mathcal{A} := (a_1, 0)$ and $\mathcal{B} := (0, a_2)$, one sees that Proposition 4.1 follows from Proposition 4.2.



4.1. Families of pairs of functions and their type. 



4.1.1. Degree and equivalence. 

Definition 4.3. If $a \colon [c, \infty) \to \mathbb{R}$ is a function with polynomial growth rate, and $k_0$ is the smallest non-negative integer $k$ such that $a(t) \prec t^k$, we define $d := k_0 - 1$ to be the degree of the function, and write $\deg(a) = d$.

As a consequence, $\deg(a) = -1$ if and only if $a(t) \to 0$, and $\deg(a) = d \geq 0$ if and only if $t^d \ll a(t) \prec t^{d+1}$. For example, $\deg(1/t) = -1$, $\deg(1) = \deg(\sqrt{t}) = \deg(t/\log t) = 0$, $\deg(t) = \deg(t^{1.5}) = 1$.
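The degree of Definition 4.3 is easy to compute for model functions. The sketch below is an illustration only and assumes functions of the special form $a(t) = t^c (\log t)^e$; for these, $a(t) \prec t^k$ holds exactly when $(c, e)$ precedes $(k, 0)$ lexicographically. The function name `degree` and its parameters are hypothetical.

```python
def degree(c, e=0.0):
    """Degree, in the sense of Definition 4.3, of a(t) = t**c * log(t)**e.

    a(t) grows slower than t**k exactly when c < k, or c == k and e < 0,
    which is the lexicographic comparison (c, e) < (k, 0).
    """
    k = 0
    while not (c, e) < (k, 0):
        k += 1  # search for the smallest non-negative k with a(t) ≺ t^k
    return k - 1

# The examples listed after Definition 4.3.
examples = {
    "1/t": degree(-1),
    "1": degree(0),
    "sqrt(t)": degree(0.5),
    "t/log t": degree(1, -1),
    "t": degree(1),
    "t^1.5": degree(1.5),
}
```

Note that $t/\log t$ has degree 0 while $t$ has degree 1: the logarithmic factor decides which side of the threshold $t^1$ the function falls on.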

We remind the reader that two functions $a, b \colon [c, \infty) \to \mathbb{R}$ have the same growth rate, in which case we write $a \sim b$, if $a(t)/b(t)$ converges to a non-zero constant as $t \to \infty$. We will make use of the following stronger notion of growth equivalence:

Definition 4.4. We say that two functions $a, b \colon [c, \infty) \to \mathbb{R}$ are equivalent, and write $a \cong b$, if they have polynomial growth rate and satisfy $\deg(a - b) < \min\{\deg(a), \deg(b)\}$.

Notice that if $a \cong b$, then $a(t)/b(t) \to 1$, but the converse is not true. For example $t^{1.5} \not\cong t^{1.5} + t^{1.1}$.



4.1.2. Families of pairs of functions. Let $m \in \mathbb{N}$. Given two ordered families of functions

$$\mathcal{A} := (a_1, \ldots, a_m), \quad \mathcal{B} := (b_1, \ldots, b_m),$$

where $a_i, b_i \colon [c, \infty) \to \mathbb{R}$ have polynomial growth rate, we define the ordered family of pairs of functions $(\mathcal{A}, \mathcal{B})$ as follows

$$(\mathcal{A}, \mathcal{B}) := ((a_1, b_1), \ldots, (a_m, b_m)).$$

The reader is advised to think of this family of pairs as an efficient way to record the functions that appear in the iterates (18).

The maximum of the degrees of the functions in the families $\mathcal{A}$ and $\mathcal{B}$ is called the degree of the family $(\mathcal{A}, \mathcal{B})$.

For convenience of exposition, if pairs of bounded functions appear in $(\mathcal{A}, \mathcal{B})$ we remove them, and henceforth we assume:

All families $(\mathcal{A}, \mathcal{B})$ that we consider do not contain pairs of bounded functions.

4.1.3. Definition of type. We fix a non-negative integer $d$ and restrict ourselves to families $(\mathcal{A}, \mathcal{B})$ with degree between $0$ and $d$.
Let

(19) $\mathcal{A}' := \{a_i \in \mathcal{A} \colon a_i \text{ is not bounded}\}$,

and

(20) $\mathcal{B}' := \{b_i \in \mathcal{B} \colon a_i \text{ is bounded}\}$.

For $i = 0, 1, \ldots, d$, let $w_{1,i}, w_{2,i}$ be the number of distinct non-equivalent classes of functions of degree $i$ in $\mathcal{A}'$ and $\mathcal{B}'$ correspondingly (if $\mathcal{B}'$ is empty, then $w_{2,i} = 0$ for $i = 0, 1, \ldots, d$). We define the (matrix) type of the family $(\mathcal{A}, \mathcal{B})$ to be the $2 \times (d + 1)$ matrix

$$\begin{pmatrix} w_{1,d} & \cdots & w_{1,0} \\ w_{2,d} & \cdots & w_{2,0} \end{pmatrix}.$$

For example, consider the family of pairs

$$\big((t^{2.5}, t^{3.5}),\ (t^{2.5} + t^{2}, t),\ (t^{2.5} + t^{1.5}, 2t),\ (t^{0.5}, t),\ ((t+1)^{0.5} - t^{0.5}, t^{1.5}),\ (0, t^{0.5})\big).$$

Then $d = 3$, $\mathcal{A}' = \{t^{2.5}, t^{2.5} + t^{2}, t^{2.5} + t^{1.5}, t^{0.5}\}$, $\mathcal{B}' = \{t^{1.5}, t^{0.5}\}$. As a consequence, the family of pairs $(\mathcal{A}, \mathcal{B})$ has type

$$\begin{pmatrix} 0 & 2 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}.$$
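The type of a concrete family can be computed mechanically from the definitions. The sketch below is an illustration under explicit assumptions: functions are finite sums of powers stored as `{exponent: coefficient}`, the example family is read as $((t^{2.5}, t^{3.5}), (t^{2.5}+t^{2}, t), (t^{2.5}+t^{1.5}, 2t), (t^{0.5}, t), ((t+1)^{0.5}-t^{0.5}, t^{1.5}), (0, t^{0.5}))$, and the bounded function $(t+1)^{0.5}-t^{0.5}$ is replaced by its leading term $0.5\,t^{-0.5}$, which is all that matters for the type. All helper names are hypothetical.

```python
import math

def sub(f, g):
    """Difference f - g of two power sums stored as {exponent: coefficient}."""
    out = dict(f)
    for e, c in g.items():
        out[e] = out.get(e, 0) - c
        if out[e] == 0:
            del out[e]
    return out

def deg(f):
    """Degree of a power sum: -1 if it tends to 0, else floor of the top exponent."""
    if not f or max(f) < 0:
        return -1
    return math.floor(max(f))

def bounded(f):
    return not f or max(f) <= 0

def equivalent(f, g):
    """Definition 4.4: deg(f - g) < min(deg f, deg g)."""
    return deg(sub(f, g)) < min(deg(f), deg(g))

def row(funcs, d):
    """Number of equivalence classes of degree d, d-1, ..., 0 among funcs."""
    reps = []
    for f in funcs:
        if not any(equivalent(f, g) for g in reps):
            reps.append(f)
    return tuple(sum(deg(f) == j for f in reps) for j in range(d, -1, -1))

def family_type(pairs, d):
    a_prime = [a for a, b in pairs if not bounded(a)]  # (19)
    b_prime = [b for a, b in pairs if bounded(a)]      # (20)
    return row(a_prime, d), row(b_prime, d)

family = [
    ({2.5: 1}, {3.5: 1}),
    ({2.5: 1, 2: 1}, {1: 1}),
    ({2.5: 1, 1.5: 1}, {1: 2}),
    ({0.5: 1}, {1: 1}),
    ({-0.5: 0.5}, {1.5: 1}),  # leading term of (t+1)^0.5 - t^0.5
    ({}, {0.5: 1}),
]
example_type = family_type(family, 3)
```

Here $t^{2.5}$ and $t^{2.5}+t^{1.5}$ fall in the same class (their difference has degree 1), while $t^{2.5}+t^{2}$ forms a second class of degree 2, which is where the entry 2 in the first row comes from.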

We order the set of all possible types lexicographically; we start by comparing the first element of the first row of each matrix, and after going through all the elements of the first row, we compare the elements of the second row of each matrix, and so on. In other words: given two $2 \times (d + 1)$ matrices $W := (w_{i,j})$ and $W' := (w'_{i,j})$, we say that $W > W'$ if: $w_{1,d} > w'_{1,d}$, or $w_{1,d} = w'_{1,d}$ and $w_{1,d-1} > w'_{1,d-1}$, ..., or $w_{1,i} = w'_{1,i}$ for $i = 0, \ldots, d$ and $w_{2,d} > w'_{2,d}$, and so on.
As an example we mention 

where in the place of the stars one can put any collection of non-negative integers. 

An important observation is that although for a given type $W$ there is an infinite number of possible types $W'$ that are smaller than $W$, we have

Lemma 4.5. Every decreasing sequence of types of families of pairs is eventually stationary. 

Therefore, if some operation reduces the type of a certain family of pairs of functions, then 
after a finite number of repetitions it will terminate. 
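The lexicographic order on types can be made concrete with a few lines of code. The sketch below is an illustration only: a type is stored as a tuple of rows, and the sample types in `chain` are chosen here for illustration rather than taken from the text. The names `key` and `is_smaller` are hypothetical.

```python
def key(W):
    """Flatten a matrix type row by row, first row first.

    Python compares tuples lexicographically, so comparing the flattened
    tuples is exactly the order on types described above.
    """
    return tuple(entry for row in W for entry in row)

def is_smaller(W1, W2):
    return key(W1) < key(W2)

# A strictly decreasing chain of 2 x 2 types.
chain = [((1, 0), (1, 0)), ((1, 0), (0, 1)), ((1, 0), (0, 0)), ((0, 7), (0, 0))]
decreasing = all(is_smaller(chain[i + 1], chain[i]) for i in range(len(chain) - 1))

# Infinitely many types lie below a given one: (0 n; 0 0) < (1 0; 0 0) for all n.
below = all(is_smaller(((0, n), (0, 0)), ((1, 0), (0, 0))) for n in range(1000))
```

Even though a type can have infinitely many smaller types, any strictly decreasing chain must stop, because the lexicographic order on tuples of non-negative integers of fixed length is a well-order; this is the content of Lemma 4.5.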

4.2. Nice families and the van der Corput operation. In this subsection we define a class of "nice" families of pairs of functions that will be instrumental for our subsequent discussion. Furthermore, we define an operation that sends nice families to nice families and reduces their type.

4.2.1. Nice families. We remind the reader of our definition of the class of good functions

$$\mathcal{G} = \{a \colon [c, \infty) \to \mathbb{R} \text{ such that } t^d \log t \prec a(t) \prec t^{d+1} \text{ for some integer } d \geq 0\}.$$

Definition 4.6. Given a function $a \colon [c, \infty) \to \mathbb{R}$, we define $\mathcal{F}(a)$ to be the family of functions that contains all integer combinations of shifts of $a$, meaning,

$$\mathcal{F}(a) := \Big\{ \sum_{i=1}^{l} k_i \, S_{h_i} a \ \colon\ l \in \mathbb{N},\ k_i \in \mathbb{Z} \Big\},$$

where the $S_{h_i} a$ are shifts of $a$.

Using Lemma 3.2, one sees that if $a \in \mathcal{G}$ and $b \in \mathcal{F}(a)$, then either $b(t) \to 0$ or $b \in \mathcal{G}$.
Henceforth, we are going to work with the following class of pairs of functions:

Definition 4.7. Let $\mathcal{H}$ be a Hardy field, $a, b \in \mathcal{G} \cap \mathcal{H}$, $a_i \in \mathcal{F}(a)$ and $b_i \in \mathcal{F}(b)$ for $i = 1, \ldots, m$, and $\mathcal{A} := (a_1, \ldots, a_m)$, $\mathcal{B} := (b_1, \ldots, b_m)$. We call the ordered family $(\mathcal{A}, \mathcal{B})$ nice if

(1) $a_1 - a_i \succ 1$ and $a_i \ll a_1$ for $i = 2, \ldots, m$;

(2) $b_i \prec a_1$ for $i = 1, \ldots, m$;

(3) $b_1 - b_i \prec a_1 - a_i$ for $i = 2, \ldots, m$.

For example, if $\mathcal{H}$ is a Hardy field, and $a, b \in \mathcal{G} \cap \mathcal{H}$ satisfy $a \succ b$, then the ordered family of pairs $((a, 0), (0, b))$ is nice. If in addition we assume that $\deg(b) \geq 1$, then also the family $((a, -b), (S_h a, -b), (0, S_h b - b))$ is nice for every $h \in \mathbb{N}$. This is a special case of a more general phenomenon that will be explained in Section 4.5.

4.2.2. The van der Corput operation. Given an ordered family of pairs of functions $(\mathcal{A}, \mathcal{B})$, a pair of functions $(a, b)$, and $h \in \mathbb{N}$, we define the following operation

$$(a, b, h)\text{-vdC}(\mathcal{A}, \mathcal{B}) := \big( (S_h a_1 - a, S_h b_1 - b), \ldots, (S_h a_m - a, S_h b_m - b), (a_1 - a, b_1 - b), \ldots, (a_m - a, b_m - b) \big)^*,$$

where $*$ is the operation that removes all pairs of bounded functions.

4.3. Strategy of proof of Proposition 4.2. Our proof strategy of Proposition 4.2 is to successively apply Lemma 3.7 in order to bound the $L^2(\mu)$ norm of the averages in question with the $L^2(\mu)$ norm of averages that are simpler to deal with. In order to carry out this reduction a key step is to show that given a nice family of pairs $(\mathcal{A}, \mathcal{B})$ with $\deg(a_1) \geq 1$, it is always possible to find $(a, b) \in (\mathcal{A}, \mathcal{B})$ such that for all large enough $h \in \mathbb{N}$ the operation $(a, b, h)$-vdC leads to a nice family of pairs that has smaller type. Eventually, this procedure leads to families of pairs with sub-linear growth (i.e. with degree 0), in which case Proposition 4.2 can be established directly in a relatively simple manner.

We explain how this reduction to the degree 0 case works in the next example:

Example. Our goal is to find $k \in \mathbb{N}$ such that if $|||f_1|||_{k,T_1} = 0$, then the averages

$$\frac{1}{N} \sum_{n=1}^{N} T_1^{[n^{1.5}]} f_1 \cdot T_2^{[n^{1.1}]} f_2 \tag{21}$$

converge to $0$ in $L^2(\mu)$ as $N \to \infty$.

We define $\mathcal{A} = (t^{1.5}, 0)$, $\mathcal{B} = (0, t^{1.1})$, and introduce the following nice family of pairs of functions

$$(\mathcal{A}, \mathcal{B}) = ((t^{1.5}, 0), (0, t^{1.1})).$$

This family is nice and has type $\begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$. Applying the vdC operation with $(a, b) = (0, t^{1.1})$, we see that for $h \in \mathbb{N}$, the ordered family $(a, b, h)$-vdC$(\mathcal{A}, \mathcal{B})$ is equal to

$$\big( ((t + h)^{1.5}, -t^{1.1}),\ (0, (t + h)^{1.1} - t^{1.1}),\ (t^{1.5}, -t^{1.1}) \big).$$

The important point is that for every $h \in \mathbb{N}$ this new family is also nice and has smaller type, namely $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Loosely speaking, one expects to be able to show (using Lemma 3.7) that the averages (21) converge to $0$ in $L^2(\mu)$ once one can show that for every $h \in \mathbb{N}$ the averages whose iterates are recorded by the family $(a, b, h)$-vdC$(\mathcal{A}, \mathcal{B})$, with $f_1$ attached to the first pair and arbitrary functions $g, g' \in L^\infty(\mu)$ attached to the other two pairs, converge to $0$ in $L^2(\mu)$.

For $h \in \mathbb{N}$, applying the vdC operation one more time with $(a, b) = (0, (t + h)^{1.1} - t^{1.1})$ leads to a nice ordered family with 4 pairs and type $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$. Lastly, for $h \in \mathbb{N}$, applying the vdC operation one more time with $(a, b) = (t^{1.5}, -(t + h)^{1.1})$, it is easy to see that we get a nice ordered family with 7 pairs and type $\begin{pmatrix} 0 & 7 \\ 0 & 0 \end{pmatrix}$. In this case all functions involved have sub-linear growth, and the iterates of $T_1$ grow faster than any of the iterates of $T_2$. Taking advantage of this fact, we can show in a relatively simple way that the corresponding multiple ergodic averages converge to $0$ in $L^2(\mu)$ if $|||f|||_{16,T_1} = 0$.
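The type reduction in this example can be replayed symbolically. The sketch below is a model under explicit assumptions, not the paper's method: a function $\sum_h k_h\, a(t+h)$ in $\mathcal{F}(a)$ is stored as `{h: k_h}`; both generators ($t^{1.5}$ for first coordinates, $t^{1.1}$ for second coordinates) have degree 1; and, following the Taylor expansion used for Lemma 4.8, the degree is read off from the first non-vanishing moment $\sum_h k_h h^i$. The shifts 1, 2, 4 and all helper names are hypothetical choices.

```python
def add(f, g, sign=1):
    """f + sign*g for functions stored as {shift h: coefficient k}, i.e. sum k*a(t+h)."""
    out = dict(f)
    for h, k in g.items():
        out[h] = out.get(h, 0) + sign * k
        if out[h] == 0:
            del out[h]
    return out

def shift(f, h):
    return {s + h: k for s, k in f.items()}

def degree(f, d):
    """Degree of f when the generator has degree d: if the first non-zero
    moment sum_h k_h h^i occurs at i <= d then f ~ a/t^i, else f(t) -> 0."""
    for i in range(d + 1):
        if sum(k * s ** i for s, k in f.items()) != 0:
            return d - i
    return -1

def row(funcs, d):
    reps = []
    for f in funcs:
        if not any(degree(add(f, g, -1), d) < min(degree(f, d), degree(g, d)) for g in reps):
            reps.append(f)
    return tuple(sum(degree(f, d) == j for f in reps) for j in range(d, -1, -1))

def family_type(pairs, d):
    a_prime = [a for a, b in pairs if degree(a, d) >= 0]
    b_prime = [b for a, b in pairs if degree(a, d) < 0]
    return row(a_prime, d), row(b_prime, d)

def vdc(pairs, chosen, h, d):
    """(a0, b0, h)-vdC: shifted pairs first, then unshifted; bounded pairs removed."""
    a0, b0 = chosen
    new = [(add(shift(a, h), a0, -1), add(shift(b, h), b0, -1)) for a, b in pairs]
    new += [(add(a, a0, -1), add(b, b0, -1)) for a, b in pairs]
    return [(a, b) for a, b in new if degree(a, d) >= 0 or degree(b, d) >= 0]

D = 1
F0 = [({0: 1}, {}), ({}, {0: 1})]   # ((t^1.5, 0), (0, t^1.1))
F1 = vdc(F0, F0[1], 1, D)           # subtract (0, t^1.1)
F2 = vdc(F1, F1[1], 2, D)           # subtract (0, (t+1)^1.1 - t^1.1)
F3 = vdc(F2, F2[3], 4, D)           # subtract (t^1.5, -(t+1)^1.1)
history = [(len(F), family_type(F, D)) for F in (F0, F1, F2, F3)]
```

In this model the families have 2, 3, 4 and 7 pairs, and the types decrease at every step; in the last family all first coordinates have degree 0 and are pairwise non-equivalent, since any two of them differ by a function of degree 0.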

4.4. Two technical lemmas. We establish two simple results that will be used repeatedly. 

Lemma 4.8. Let $a \colon [c, \infty) \to \mathbb{R}$ be a Hardy field function with non-negative degree $d$ and let $b \in \mathcal{F}(a)$. Then either $b(t) \to 0$, or there exists $k \in \{0, \ldots, d\}$ such that $b \sim a/t^k$.

Proof. Without loss of generality we can assume that $a(t) \to \infty$. Suppose that

$$b = \sum_{i=1}^{l} k_i \cdot S_{h_i} a.$$

Since $\deg(a) = d$ we have by Lemma 3.2 that $a^{(d+1)}(t) \to 0$. Using this and Lagrange's remainder formula for the Taylor series of the function $a(t)$, we see that for $h \in \mathbb{N}$ we have

$$S_h a = \sum_{i=0}^{d} a^{(i)} \, h^i / i! + e_h$$

where $e_h \colon [c, \infty) \to \mathbb{R}$ is a function that satisfies $e_h(t) \to 0$. Combining the above identities we deduce that

$$b = \sum_{i=0}^{d} c_i \, a^{(i)} + e$$

for some constants $c_i \in \mathbb{R}$ and function $e \colon [c, \infty) \to \mathbb{R}$ that satisfies $e(t) \to 0$. If $c_i = 0$ for $i = 0, \ldots, d$, then $b(t) \to 0$. Otherwise, let $i_0$ be the smallest $i$ such that $c_i \neq 0$. Then $b \sim a^{(i_0)}$, and by Lemma 3.2 we have $a^{(i_0)} \sim a/t^{i_0}$. Taking $k = i_0$ completes the proof. □

Lemma 4.9. Let $a \colon [c, \infty) \to \mathbb{R}$ be a Hardy field function with polynomial growth rate and $a_1, a_2 \in \mathcal{F}(a)$ be such that $a_1 \succ t^\varepsilon$ for some $\varepsilon > 0$ and $a_2 \ll a_1$.

(i) If $a_1 \not\cong a_2$, then $S_h a_1 - a_2 \sim a_1$ for every non-zero $h \in \mathbb{R}$.

(ii) If $a_1 \cong a_2$, then $S_h a_1 - a_2 \ll a_1/t$ for every $h \in \mathbb{R}$, and $S_h a_1 - a_2 \sim a_1/t$ for all but one $h \in \mathbb{R}$.

Remark. The assumption $a_1, a_2 \in \mathcal{F}(a)$ is necessary. For (i) take $a_1(t) = t^{1.5} + t^{1.1}$, $a_2(t) = t^{1.5}$, and for (ii) take $a_1(t) = t^{1.5} + t^{0.9}$, $a_2(t) = t^{1.5}$.

Proof. We prove (i). Suppose on the contrary that $S_h a_1 - a_2 \not\sim a_1$ for some $h \in \mathbb{R}$. Since $S_h a_1 - a_2 \ll a_1$, we deduce that $S_h a_1 - a_2 \prec a_1$.

We claim that $S_h a_1 - a_2 \ll a_1/t$. Indeed, by Lemma 4.8 we have $a_1 \sim a/t^k$ for some non-negative integer $k$. Since $S_h a_1 - a_2 \in \mathcal{F}(a)$, Lemma 4.8 gives that either $S_h a_1 - a_2 \prec 1$, or $S_h a_1 - a_2 \sim a/t^{k'}$ for some non-negative integer $k'$. If $S_h a_1 - a_2 \prec 1$, then the claim is proved because $\deg(a_1) \geq 1$. If $S_h a_1 - a_2 \sim a/t^{k'}$, then since $S_h a_1 - a_2 \prec a_1 \sim a/t^k$, we deduce that $k' > k$, proving the claim.

Using the previous claim, Lemma 3.2, and expressing $a_1 - a_2$ as $(a_1 - S_h a_1) + (S_h a_1 - a_2)$, we deduce that $a_1 - a_2 \ll a_1/t$. This is a contradiction since by assumption $a_1 \not\cong a_2$.

We prove (ii). Expressing $S_h a_1 - a_2$ as $(S_h a_1 - a_1) + (a_1 - a_2)$ and using Lemma 3.2 and our assumption $a_1 \cong a_2$, we see that for every $h \in \mathbb{R}$ we have $S_h a_1 - a_2 \prec a_1$. From this we deduce as in the proof of part (i) that $S_h a_1 - a_2 \ll a_1/t$ for every $h \in \mathbb{R}$. It remains to show that if $S_{h_0} a_1 - a_2 \prec a_1/t$, then $S_h a_1 - a_2 \sim a_1/t$ for every $h \neq h_0$. To see this, we express $S_h a_1 - a_2$ as $(S_h a_1 - S_{h_0} a_1) + (S_{h_0} a_1 - a_2)$, and use that by Lemma 3.2 we have $S_h a_1 - a_1 \sim a_1/t$ for every non-zero $h \in \mathbb{R}$. This completes the proof. □
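Both lemmas can be illustrated on power functions. The sketch below is an illustration under stated assumptions: the generator is $a(t) = t^c$ with non-integer $c > 0$, an element $\sum_h k_h\, a(t+h)$ of $\mathcal{F}(a)$ is stored as `{h: k_h}`, and the leading behaviour is read off from the Taylor coefficients $c_i = \sum_h k_h h^i / i!$ as in the proof of Lemma 4.8. The helper names are hypothetical.

```python
from math import factorial

def leading_term(f, c):
    """For b = sum_h k_h*(t+h)**c return (i, C) with b(t) ~ C * t**(c - i),
    or None when all coefficients c_0..c_floor(c) vanish (then b(t) -> 0)."""
    for i in range(int(c) + 1):
        c_i = sum(k * h ** i for h, k in f.items()) / factorial(i)
        if c_i != 0:
            falling = 1.0
            for j in range(i):
                falling *= c - j  # c (c-1) ... (c-i+1), from the i-th derivative of t^c
            return i, c_i * falling
    return None

def evaluate(f, c, t):
    return sum(k * (t + h) ** c for h, k in f.items())

# Lemma 4.8: the second difference of t^2.5 behaves like 3.75 * t^0.5.
second_diff = {2: 1, 1: -2, 0: 1}
order, const = leading_term(second_diff, 2.5)
ratio = evaluate(second_diff, 2.5, 1e4) / (1e4 ** (2.5 - order))

# Lemma 4.9 (ii): a1 = t^1.5 and a2 = (t+1)^1.5 are equivalent, and
# S_h a1 - a2 has the size of a1/t except for a single exceptional shift h.
def shifted_difference(h):
    f = {h: 1}
    f[1] = f.get(1, 0) - 1
    return {s: k for s, k in f.items() if k != 0}

exceptional = [h for h in range(6) if leading_term(shifted_difference(h), 1.5) is None]
```

In this toy case the exceptional shift is $h = 1$, where $S_h a_1 - a_2$ vanishes identically; for every other $h$ the difference is comparable to $a_1/t$.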

4.5. Reducing the type. The next lemma is a key ingredient of the proof of Proposition 4.2. 

Lemma 4.10. Let $(\mathcal{A}, \mathcal{B})$ be a nice family of pairs of functions, and suppose that $\deg(a_1) \geq 1$. Then there exist $\tilde{a} \in \mathcal{A} \cup \{0\}$ and $\tilde{b} \in \mathcal{B}$, such that for every large enough $h \in \mathbb{N}$, the family $(\tilde{a}, \tilde{b}, h)$-vdC$(\mathcal{A}, \mathcal{B})$ is nice and has type strictly smaller than that of $(\mathcal{A}, \mathcal{B})$.

Proof. By assumption, there exists a Hardy field $\mathcal{H}$, functions $a, b \in \mathcal{G} \cap \mathcal{H}$, and $a_1, \ldots, a_m \in \mathcal{F}(a)$, $b_1, \ldots, b_m \in \mathcal{F}(b)$, such that $\mathcal{A} = (a_1, \ldots, a_m)$, $\mathcal{B} = (b_1, \ldots, b_m)$. Given a pair of functions $(\tilde{a}, \tilde{b}) \in (\mathcal{A}, \mathcal{B})$ and $h \in \mathbb{N}$, the family $(\tilde{a}, \tilde{b}, h)$-vdC$(\mathcal{A}, \mathcal{B})$ is an ordered family of pairs of functions, all of them of the form

$$(S_h a_i - \tilde{a}, S_h b_i - \tilde{b}), \quad \text{or} \quad (a_i - \tilde{a}, b_i - \tilde{b}).$$

We choose $(\tilde{a}, \tilde{b})$ as follows: If the family $\mathcal{B}'$, defined by (20), is non-empty, then we take $\tilde{a} = 0$ and let $\tilde{b}$ be a function in $\mathcal{B}'$ with minimal degree. Then the first row of the matrix type remains unchanged, and one easily checks using Lemma 4.9 in the positive degree case and Lemma 3.2 in the degree 0 case, that the second row of the matrix type gets "reduced", leading to a smaller matrix type for every $h \in \mathbb{N}$. Suppose now that the family $\mathcal{B}'$ is empty, in which case all the functions in the family $\mathcal{A}$ are unbounded. If $\mathcal{A}$ consists of a single function $a_1$, then we choose $(\tilde{a}, \tilde{b}) := (a_1, b_1)$ and the result follows. Therefore, we can assume that $\mathcal{A}$ contains a function other than $a_1$. We consider two cases. If $a_i \cong a_1$ for $i = 2, \ldots, m$, then we choose $(\tilde{a}, \tilde{b}) := (a_1, b_1)$. Otherwise, we choose $(\tilde{a}, \tilde{b}) \in (\mathcal{A}, \mathcal{B})$ such that $\tilde{a} \not\cong a_1$ and $\tilde{a}$ is a function in $\mathcal{A}'$ (see (19)) with minimal degree (such a choice exists since $a_1$ has the highest degree in $\mathcal{A}$). In all cases, for every $h \in \mathbb{N}$, one checks using Lemmas 3.2 and 4.9 that the first row of the matrix type of $(\tilde{a}, \tilde{b}, h)$-vdC$(\mathcal{A}, \mathcal{B})$ is "smaller" than that of $(\mathcal{A}, \mathcal{B})$, and as a consequence the new family has strictly smaller type.

It remains to verify that for every large enough $h \in \mathbb{N}$ the ordered family of pairs of functions $(\tilde{a}, \tilde{b}, h)$-vdC$(\mathcal{A}, \mathcal{B})$ is nice. We remark that, by construction, the first pair of functions in this family is $(S_h a_1 - \tilde{a}, S_h b_1 - \tilde{b})$.

Claim. Property (1) of Definition 4.7 holds for all large enough $h \in \mathbb{N}$.

To prove the first part of Property (1) it suffices to show that for all large enough $h$

$$S_h a_1 - S_h a_i \to \infty \quad \text{for } i = 2, \ldots, m$$

and

$$S_h a_1 - a_i \to \infty \quad \text{for } i = 1, \ldots, m.$$

The first property follows immediately from our assumption $a_1 - a_i \to \infty$ for $i = 2, \ldots, m$, and the second property follows upon observing that for all large enough $h \in \mathbb{N}$ we have by Lemma 4.9 that $a_1/t \ll S_h a_1 - a_i$ and our assumption $\deg(a_1) \geq 1$ which combined with the property $a_1 \in \mathcal{G}$ gives that $t \prec a_1$.

To prove the second part of Property (1) it suffices to show that for all large enough $h \in \mathbb{N}$

$$S_h a_i - \tilde{a} \ll S_h a_1 - \tilde{a}, \quad \text{for } i = 1, \ldots, m$$

and

$$a_i - \tilde{a} \ll S_h a_1 - \tilde{a}, \quad \text{for } i = 1, \ldots, m.$$

We only prove the first property, the second can be proved in a similar fashion. We consider two cases. If $\tilde{a} \not\cong a_1$, then by Lemma 4.9 for all but one $h \in \mathbb{N}$ we have $S_h a_1 - \tilde{a} \sim a_1$, and the estimate follows by our assumption $a_i \ll a_1$ for $i = 1, \ldots, m$. If $\tilde{a} \cong a_1$, then by construction $a_i \cong a_1$ for $i = 1, \ldots, m$. Therefore, for all large enough $h \in \mathbb{N}$ we have by Lemma 4.9 that $S_h a_i - \tilde{a} \sim a_1/t$ for $i = 1, \ldots, m$. The result follows.

Claim. Property (2) of Definition 4.7 holds for all large enough $h \in \mathbb{N}$.

It suffices to show that for all large enough $h \in \mathbb{N}$

$$S_h b_i - \tilde{b} \prec S_h a_1 - \tilde{a}, \quad \text{for } i = 1, \ldots, m$$

and

$$b_i - \tilde{b} \prec S_h a_1 - \tilde{a}, \quad \text{for } i = 1, \ldots, m.$$

We only prove the first property, the second one can be proved in a similar fashion. We consider two cases.

If $\tilde{a} \not\cong a_1$, then by Lemma 4.9 for all but one $h \in \mathbb{N}$ we have $S_h a_1 - \tilde{a} \sim a_1$, and so the result follows since by assumption $b_i \prec a_1$ for $i = 1, \ldots, m$.

If $\tilde{a} \cong a_1$, then by construction $(\tilde{a}, \tilde{b}) = (a_1, b_1)$ and $\tilde{a} \cong a_i$ for $i = 1, \ldots, m$. It therefore remains to show that for all large enough $h \in \mathbb{N}$ we have $S_h b_i - b_1 \prec S_h a_1 - a_1$ for $i = 1, \ldots, m$. To see this, we express $S_h b_i - b_1$ as $(S_h b_i - b_i) + (b_i - b_1)$. If $1 \prec b_i$, then $b_i \in \mathcal{G}$ (by Lemma 4.8) and Lemma 3.2 gives that for every $h \in \mathbb{N}$ we have $S_h b_i - b_i \ll b_i/t \prec a_1/t$. If $b_i \ll 1$, then since $t \prec a_1$ we still get $S_h b_i - b_i \prec a_1/t$. Furthermore, for $i = 2, \ldots, m$, by assumption we have $b_i - b_1 \prec a_i - a_1$ and by Lemma 4.9 we have $a_i - a_1 \ll a_1/t$. Combining the above we get for every $h \in \mathbb{N}$ that $S_h b_i - b_1 \prec a_1/t$ for $i = 1, \ldots, m$. Since by Lemma 3.2 for every $h \in \mathbb{N}$ we have $S_h a_1 - a_1 \sim a_1/t$, the result follows.

Claim. Property (3) of Definition 4.7 holds for all large enough $h$.

Equivalently, we claim that for all large enough $h \in \mathbb{N}$

$$S_h b_1 - S_h b_i \prec S_h a_1 - S_h a_i, \quad \text{for } i = 2, \ldots, m,$$

and

$$S_h b_1 - b_i \prec S_h a_1 - a_i, \quad \text{for } i = 1, \ldots, m.$$

The first property follows immediately from our hypothesis $b_1 - b_i \prec a_1 - a_i$ for $i = 2, \ldots, m$. We verify the second property. If $a_i \not\cong a_1$, then by Lemma 4.9 we have for all large enough $h \in \mathbb{N}$ that $S_h a_1 - a_i \sim a_1$. The desired estimate now follows since by hypothesis $b_i \prec a_1$ for $i = 1, \ldots, m$. Suppose now that $a_i \cong a_1$. Then Lemma 4.9 gives for all large enough $h$ that $S_h a_1 - a_i \sim a_1/t$. So it remains to verify that for every large enough $h \in \mathbb{N}$ we have $S_h b_1 - b_i \prec a_1/t$. To see this we express $S_h b_1 - b_i$ as $(S_h b_1 - b_1) + (b_1 - b_i)$. Our assumptions and Lemma 3.2 give that $S_h b_1 - b_1 \ll b_1/t \prec a_1/t$ for all $h \in \mathbb{N}$. Furthermore, our assumptions and Lemma 4.9 give that $b_1 - b_i \prec a_1 - a_i \ll a_1/t$. Hence, for every $h \in \mathbb{N}$ we have $S_h b_1 - b_i \prec a_1/t$, as desired. This completes the proof. □

4.6. Some ergodic estimates. We gather here some simple ergodic estimates that will be 
used in the proof of Proposition 4.2. 

Using successive applications of Lemma 3.7 one can show the following (see for example Case 
1 of Proposition 5.3 in [16]): 

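Lemma 4.11. Let $(X, \mathcal{X}, \mu, T)$ be a system, $f_1, \ldots, f_m \in L^\infty(\mu)$ be functions bounded by 1, and $a_1, \ldots, a_m$ be non-zero integers such that $a_i \neq a_1$ for $i = 2, \ldots, m$. Then there exists $C = C_{m, a_2, \ldots, a_m}$ such that

$$\limsup_{N - M \to \infty} \ \sup_{\|f_2\|_\infty, \ldots, \|f_m\|_\infty \leq 1} \Big\| \frac{1}{N - M} \sum_{n=M}^{N} \prod_{i=1}^{m} T^{[a_i n]} f_i \Big\|_{L^2(\mu)} \leq C \, |||f_1|||_{2m,T}.$$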



The next two lemmas will help us handle bounded error terms that later on appear on the 
iterates of the transformations involved. 

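Lemma 4.12. Let $(X, \mathcal{X}, \mu, T_1, \ldots, T_\ell)$ be a system, $f_1, \ldots, f_m \in L^\infty(\mu)$ be functions, and for $i = 1, \ldots, m$, $j = 1, \ldots, \ell$, let $(a_{i,j}(n))$ be sequences with integer values. Then for every $N \in \mathbb{N}$

$$\sup_{E \subset \mathbb{N}} \Big\| \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}_E(n) \prod_{i=1}^{m} (T_1^{a_{i,1}(n)} \cdots T_\ell^{a_{i,\ell}(n)}) f_i \Big\|_{L^2(\mu)}^2 \leq \Big\| \frac{1}{N} \sum_{n=1}^{N} \prod_{i=1}^{m} (\tilde{T}_1^{a_{i,1}(n)} \cdots \tilde{T}_\ell^{a_{i,\ell}(n)}) \tilde{f}_i \Big\|_{L^2(\tilde{\mu})} \tag{22}$$

where $\tilde{T}_j := T_j \times T_j$, $\tilde{\mu} := \mu \times \mu$, and $\tilde{f}_i := f_i \otimes \bar{f}_i$.

Proof. Letting

$$F_n := \prod_{i=1}^{m} (T_1^{a_{i,1}(n)} \cdots T_\ell^{a_{i,\ell}(n)}) f_i,$$

we see that the left hand side in (22) is bounded by

$$\frac{1}{N^2} \sum_{1 \leq m, n \leq N} \Big| \int F_n \cdot \bar{F}_m \, d\mu \Big|.$$

It follows that the square of the left hand side in (22) is bounded by

$$\frac{1}{N^2} \sum_{1 \leq m, n \leq N} \Big| \int F_n \cdot \bar{F}_m \, d\mu \Big|^2 = \frac{1}{N^2} \sum_{1 \leq m, n \leq N} \int G_n \cdot \bar{G}_m \, d\tilde{\mu} = \Big\| \frac{1}{N} \sum_{n=1}^{N} G_n \Big\|_{L^2(\tilde{\mu})}^2,$$

where

$$G_n := \prod_{i=1}^{m} (\tilde{T}_1^{a_{i,1}(n)} \cdots \tilde{T}_\ell^{a_{i,\ell}(n)}) \tilde{f}_i.$$

This completes the proof. □

We deduce from this the following: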

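Lemma 4.13. Let $(X, \mathcal{X}, \mu, T_1, \ldots, T_\ell)$ be a system, $f_1, \ldots, f_m \in L^\infty(\mu)$ be functions, and for $i = 1, \ldots, m$, $j = 1, \ldots, \ell$, let $(a_{i,j}(n))$ be sequences with integer values and $(e_{i,j}(n))$ be sequences that take values in some finite set of integers $F$. Then for every $N \in \mathbb{N}$

$$\sup_{E \subset \mathbb{N}} \Big\| \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}_E(n) \prod_{i=1}^{m} (T_1^{a_{i,1}(n) + e_{i,1}(n)} \cdots T_\ell^{a_{i,\ell}(n) + e_{i,\ell}(n)}) f_i \Big\|_{L^2(\mu)}^2 \leq |F|^{2\ell m} \cdot \max_{c_{i,j} \in F} \Big\| \frac{1}{N} \sum_{n=1}^{N} \prod_{i=1}^{m} (\tilde{T}_1^{a_{i,1}(n) + c_{i,1}} \cdots \tilde{T}_\ell^{a_{i,\ell}(n) + c_{i,\ell}}) \tilde{f}_i \Big\|_{L^2(\tilde{\mu})}$$

where $\tilde{T}_j := T_j \times T_j$, $\tilde{\mu} := \mu \times \mu$, and $\tilde{f}_i := f_i \otimes \bar{f}_i$.

Proof. The norm on the left hand side is at most

$$\sum_{s=1}^{t} \Big\| \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}_{E_s}(n) \prod_{i=1}^{m} (T_1^{a_{i,1}(n) + e_{i,1}(n)} \cdots T_\ell^{a_{i,\ell}(n) + e_{i,\ell}(n)}) f_i \Big\|_{L^2(\mu)},$$

where the sets $E_1, \ldots, E_t$ ($t \leq |F|^{\ell m}$) form a partition of $E$ into sets where the sequences $e_{i,j}$ are all constant. The desired estimate is now an immediate consequence of Lemma 4.12. □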

4.7. Proof of Proposition 4.2. We start with an elementary lemma that will be used to prove seminorm estimates in the case where all the iterates have sub-linear growth.

Lemma 4.14. Let $a \colon [c, \infty) \to \mathbb{R}$ be a positive Hardy field function that satisfies the growth condition $\log t \prec a(t) \prec t$ and $(A(n))$ be a bounded sequence in a normed space such that

$$\lim_{N - M \to \infty} \Big\| \frac{1}{N - M} \sum_{n=M}^{N} A(n) \Big\| = 0. \quad \text{Then} \quad \lim_{N \to \infty} \Big\| \frac{1}{N} \sum_{n=1}^{N} A([a(n)]) \Big\| = 0.$$

Remark. When $t^\varepsilon \prec a(t) \prec t$ for some $\varepsilon > 0$, the conclusion holds under the weaker assumption

$$\lim_{N \to \infty} \Big\| \frac{1}{N} \sum_{n=1}^{N} A(n) \Big\| = 0.$$

Proof. Letting $w(n) = |\{k \in \mathbb{N} \colon [a(k)] = n\}|$ and $W(N) = w(1) + \cdots + w(N)$, it suffices to show that $\lim_{N \to \infty} \big\| \frac{1}{W(N)} \sum_{n=1}^{N} w(n) \cdot A(n) \big\| = 0$. Letting $b(t) = a^{-1}(t)$, one checks that $w(n)/(b(n + 1) - b(n)) \to 1$ and $W(n)/b(n) \to 1$. Our assumptions give that $\log(b(t)) \prec t \prec b(t)$. This implies that $b(t + 1) - b(t) \to \infty$ and $(b(t + 1) - b(t))/b(t) \to 0$. Hence, $w(n) \to \infty$ and $w(n)/W(n) \to 0$. The needed convergence to $0$ now follows from Theorem 3.6 in [5]. □
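The weights in this change of variables are easy to compute in a model case. The sketch below is only an illustration and assumes $a(t) = \sqrt{t}$, so that $b(t) = a^{-1}(t) = t^2$; the cutoff `K` and the variable names are hypothetical.

```python
from math import isqrt

K = 10_000  # count the integers k = 1..K; K = 100^2, so n = 1..99 are complete

# w[n] = number of k with [a(k)] = n, for a(t) = sqrt(t).
w = {}
for k in range(1, K + 1):
    n = isqrt(k)  # exact integer part of sqrt(k)
    w[n] = w.get(n, 0) + 1

def W(N):
    """Partial sums W(N) = w(1) + ... + w(N)."""
    return sum(w[n] for n in range(1, N + 1))

# With b(t) = t^2 one has b(n+1) - b(n) = 2n + 1, matching w(n) exactly here.
matches_increment = all(w[n] == (n + 1) ** 2 - n ** 2 for n in range(1, 100))
ratio_W_b = W(99) / 99 ** 2   # W(n)/b(n), expected to be close to 1
ratio_w_W = w[99] / W(99)     # w(n)/W(n), expected to be small
```

In this example $w(n) = 2n + 1$ tends to infinity while $w(n)/W(n)$ tends to $0$, which are the two properties of the weights used at the end of the proof.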
We are now in position to prove Proposition 4.2. Given a Hardy field $\mathcal{H}$ and functions $a, b \in \mathcal{G} \cap \mathcal{H}$ our goal is to establish the following claim:

Claim: Let $a_i \in \mathcal{F}(a)$, $b_i \in \mathcal{F}(b)$ for $i = 1, \ldots, m$, and $(\mathcal{A}, \mathcal{B})$ be a nice family of ordered pairs of functions where $\mathcal{A} := (a_1, \ldots, a_m)$, $\mathcal{B} := (b_1, \ldots, b_m)$. Let $W$ be the matrix type of this family. Then there exists $k = k(W, m) \in \mathbb{N}$ such that: If $|||f_1|||_{k,T_1} = 0$, then the averages

$$\frac{1}{N} \sum_{n=1}^{N} \prod_{i=1}^{m} (T_1^{[a_i(n)]} T_2^{[b_i(n)]}) f_i \tag{23}$$

converge to $0$ in $L^2(\mu)$.

Note that the conclusion of Proposition 4.2 is somewhat stronger in two respects: (i) The integer $k$ depends only on the degree of the family. This strengthening easily follows from the above mentioned claim after noticing that there is only a finite number of possible matrix types for families that have fixed degree and number of pairs of functions. (ii) The conclusion involves a supremum over all subsets of $\mathbb{N}$. This strengthening follows by combining the above mentioned statement with Lemma 4.12 and the fact that $|||f|||_{k+1,T} = 0$ implies that $|||f \otimes \bar{f}|||_{k, T \times T} = 0$ (this follows from (15)).

We proceed to prove the claim by induction on the type of the nice family $(\mathcal{A}, \mathcal{B})$.

Base Case: Suppose that $\deg(a_1) = 0$, in which case, for $i = 1, \ldots, m$ the functions $a_i$ and $b_i$ have sub-linear growth. We are going to show that if $|||f_1|||_{2m+1,T_1} = 0$, then the averages (23) converge to $0$ in $L^2(\mu)$.

Our assumption implies that for $i = 2, \ldots, m$ one has

$$a_i(t) = \alpha_i a_1(t) + c_i(t)$$

for some $\alpha_i \in \mathbb{R}$ and functions $c_i$ that satisfy $c_i \prec a_1$. It is important to note that $\alpha_i \neq 1$ for $i = 2, \ldots, m$. Otherwise $a_1 - a_i \prec a_1$, and since $a_1 - a_i \in \mathcal{F}(a)$ and $\deg(a_1) = 0$, we deduce by Lemma 4.8 that $a_1 - a_i \to 0$, contradicting our assumption that the family $(\mathcal{A}, \mathcal{B})$ is nice. Let

$$\tilde{b}_i := b_i \circ a_1^{-1}, \quad \tilde{c}_i := c_i \circ a_1^{-1}.$$

(We caution the reader that these functions are not necessarily Hardy field functions.) Since $b_i \prec a_1$ and $c_i \prec a_1$ we have $\tilde{b}_i \prec t$ and $\tilde{c}_i \prec t$. Furthermore, one sees that

$$[a_i(n)] = [\alpha_i [a_1(n)]] + [\tilde{c}_i([a_1(n)])] + e_i(n), \quad [b_i(n)] = [\tilde{b}_i([a_1(n)])] + e'_i(n),$$

where the sequences $(e_i(n))$, $(e'_i(n))$ take finitely many integer values. Therefore, it suffices to show that the averages in $n$ of

$$T_1^{[a_1(n)]} T_2^{[\tilde{b}_1([a_1(n)])] + e'_1(n)} f_1 \cdot \prod_{i=2}^{m} \big( T_1^{[\alpha_i [a_1(n)]] + [\tilde{c}_i([a_1(n)])] + e_i(n)} T_2^{[\tilde{b}_i([a_1(n)])] + e'_i(n)} \big) f_i$$

converge to $0$ in $L^2(\mu)$.
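The claim that the error sequences take finitely many values can be checked on a toy pair of functions. The sketch below is an illustration under explicit assumptions: $a_1(t) = \sqrt{t}$ and $a_2(t) = 2\sqrt{t} + t^{1/4}$, so that $\alpha_2 = 2$, $c_2(t) = t^{1/4}$ and $\tilde{c}_2(s) = c_2(a_1^{-1}(s)) = \sqrt{s}$. These functions and the name `error_term` are hypothetical choices, not taken from the text.

```python
from math import floor, isqrt, sqrt

def error_term(n):
    """e_2(n) = [a_2(n)] - [alpha_2 [a_1(n)]] - [c~_2([a_1(n)])] for the toy functions
    a_1(t) = sqrt(t), a_2(t) = 2 sqrt(t) + t^(1/4), c~_2(s) = sqrt(s)."""
    s = isqrt(n)                                  # [a_1(n)]
    a2 = floor(2 * sqrt(n) + sqrt(sqrt(n)))       # [a_2(n)]
    return a2 - 2 * s - floor(sqrt(s))

# Collect all values of the error sequence over a long range of n.
values = {error_term(n) for n in range(1, 100_001)}
```

Over this range the error sequence only takes a few small non-negative values, which is what allows Lemma 4.13 to absorb it at the cost of a bounded constant.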

By Lemma 4.13 it suffices to show that the averages in $n$ of

$$\tilde{T}_1^{[a_1(n)]} \tilde{T}_2^{[\tilde{b}_1([a_1(n)])]} \tilde{f}_1 \cdot \prod_{i=2}^{m} \big( \tilde{T}_1^{[\alpha_i [a_1(n)]] + [\tilde{c}_i([a_1(n)])]} \tilde{T}_2^{[\tilde{b}_i([a_1(n)])]} \big) f_i$$

converge to $0$ in $L^2(\tilde{\mu})$ for all $f_i \in L^\infty(\tilde{\mu})$, $i = 2, \ldots, m$, where $\tilde{T}_j := T_j \times T_j$, $\tilde{\mu} := \mu \times \mu$, and $\tilde{f}_1 := f_1 \otimes \bar{f}_1$. Using Lemma 4.14 we can further reduce matters to showing that for every sequence $(I_N)$ of intervals of integers with lengths increasing to infinity, the averages

$$\frac{1}{|I_N|} \sum_{n \in I_N} \tilde{T}_1^{n} \tilde{T}_2^{[\tilde{b}_1(n)]} \tilde{f}_1 \cdot \prod_{i=2}^{m} \big( \tilde{T}_1^{[\alpha_i n] + [\tilde{c}_i(n)]} \tilde{T}_2^{[\tilde{b}_i(n)]} \big) f_i$$

converge to $0$ in $L^2(\tilde{\mu})$ as $N \to \infty$.

Using our assumptions, one easily sees that the functions $\tilde{c}_i(t + 1) - \tilde{c}_i(t)$ and $\tilde{b}_i(t + 1) - \tilde{b}_i(t)$ converge to $0$ and have eventually constant sign. Because of this, it is possible to decompose each interval $I_N$ (except a finite set with fixed cardinality) into sub-intervals with length tending to infinity, and such that for every $N \in \mathbb{N}$ the sequences $([\tilde{c}_2(n)]), \ldots, ([\tilde{c}_m(n)])$ and $([\tilde{b}_1(n)]), \ldots, ([\tilde{b}_m(n)])$ are constant on each interval. Thus, without loss of generality we can assume that all these sequences are constant in each interval $I_N$. Then the desired fact would follow if we prove that the averages

$$\frac{1}{|I_N|} \sum_{n \in I_N} \tilde{T}_1^{n} \tilde{f}_1 \cdot \prod_{i=2}^{m} \big( \tilde{T}_1^{[\alpha_i n] + c_{i,N}} \tilde{T}_2^{d_{i,N}} \big) f_i$$

converge to $0$ in $L^2(\tilde{\mu})$ as $N \to \infty$, for every choice of integers $c_{i,N}, d_{i,N}$. This follows from Lemma 4.11 and the fact that $|||f_1|||_{2m+1,T_1} = 0$ implies that $|||\tilde{f}_1|||_{2m,\tilde{T}_1} = 0$.

Inductive step: Let now $(\mathcal{A}, \mathcal{B})$ be a nice family of $m$ ordered pairs of functions, of matrix type $W$ and such that $\deg(a_1) \geq 1$. Suppose that the statement we want to prove holds for every nice family of $2m$ ordered pairs of functions with matrix type $W'$ strictly less than $W$ (there is a finite number of such families), and let $k(W', 2m)$ be the integer for which the conclusion of the corresponding statement holds. We let $k(W, m) = \max_{W' < W}(k(W', 2m)) + 1$. Our goal is to show that $k(W, m)$ works for the family $(\mathcal{A}, \mathcal{B})$. Since in the base case we covered all nice families with degree 0, this is going to complete the induction.

So assuming that $|||f_1|||_{k(W,m),T_1} = 0$, we want to show that the averages (23) converge to $0$ in $L^2(\mu)$. By Lemma 3.7 it suffices to show that for large enough $h \in \mathbb{N}$ the averages in $n$ of

$$\int \prod_{i=1}^{m} (T_1^{[a_i(n+h)]} T_2^{[b_i(n+h)]}) f_i \cdot (T_1^{[a_i(n)]} T_2^{[b_i(n)]}) \bar{f}_i \, d\mu$$

converge to $0$. We compose with $T_1^{-[\tilde{a}(n)]} T_2^{-[\tilde{b}(n)]}$, where $(\tilde{a}, \tilde{b}) \in (\mathcal{A}, \mathcal{B})$ is chosen as in Lemma 4.10, and use the Cauchy-Schwarz inequality. This reduces matters to showing that for every large enough $h \in \mathbb{N}$ the averages in $n$ of

$$\prod_{i=1}^{m} (T_1^{[a_i(n+h) - \tilde{a}(n)] + e_{1,i}(n)} T_2^{[b_i(n+h) - \tilde{b}(n)] + e_{2,i}(n)}) f_i \cdot (T_1^{[a_i(n) - \tilde{a}(n)] + e_{3,i}(n)} T_2^{[b_i(n) - \tilde{b}(n)] + e_{4,i}(n)}) \bar{f}_i$$

converge to $0$ in $L^2(\mu)$ where $e_{i,j}$ are sequences that take values in the set $\{0, 1\}$. By Lemma 4.13 it suffices to show that the averages in $n$ of

$$\prod_{i=1}^{m} (\tilde{T}_1^{[a_i(n+h) - \tilde{a}(n)] + c_{1,i}} \tilde{T}_2^{[b_i(n+h) - \tilde{b}(n)] + c_{2,i}}) \tilde{f}_i \cdot (\tilde{T}_1^{[a_i(n) - \tilde{a}(n)] + c_{3,i}} \tilde{T}_2^{[b_i(n) - \tilde{b}(n)] + c_{4,i}}) \bar{\tilde{f}}_i \tag{24}$$

converge to $0$ in $L^2(\tilde{\mu})$, where $c_{i,j}$ are constants with values either $0$ or $1$, $c_{1,1} = c_{2,1} = 0$, and $\tilde{T}_j := T_j \times T_j$, $\tilde{\mu} := \mu \times \mu$, $\tilde{f}_i := f_i \otimes \bar{f}_i$. We remove the functions that happen to be composed with eventually constant iterates of $\tilde{T}_1$ and $\tilde{T}_2$ (this will happen when the functions involved are bounded), since they do not affect convergence to $0$. This corresponds to the operation $*$ defined in Section 4.2.2, and the resulting multiple ergodic averages are associated with the families of functions $(\tilde{a}, \tilde{b}, h)$-vdC$(\mathcal{A}, \mathcal{B})$. Our final goal is to show that these averages converge to $0$ in $L^2(\tilde{\mu})$ for every large enough $h \in \mathbb{N}$.

By Lemma 4.10, for every large enough $h \in \mathbb{N}$, the family $(\tilde{a}, \tilde{b}, h)$-vdC$(\mathcal{A}, \mathcal{B})$ is nice, has type $W'$ strictly smaller than $W$, and its first pair is $([a_1(n + h) - \tilde{a}(n)], [b_1(n + h) - \tilde{b}(n)])$. Notice also that in (24) the corresponding iterate is applied to the function $\tilde{f}_1$.

Since $|||f_1|||_{k(W,m),T_1} = 0$ implies that $|||\tilde{f}_1|||_{k(W',2m),\tilde{T}_1} = 0$, the induction hypothesis applies and proves convergence to $0$ in $L^2(\tilde{\mu})$. This completes the proof of Proposition 4.2.

5. Seminorm estimates for the highest degree iterate: The general case 

The next proposition is the generalization of Proposition 4.2 to the case of an arbitrary 
number of transformations. To avoid unnecessary repetition, we define the concepts needed in 
the proof of Proposition 5.1, and then only summarize its proof providing details only when 
non-trivial modifications of the arguments used in the previous section are needed. 

Proposition 5.1. Let {X, A! , fi,Ti, . . . ,Tg) be a system, and fi,...,fm G L°°{iJ,). Suppose 
that {Ai, . . . ,Ae) is a nice ordered family of i-tuples of functions with degree d (all notions are 
defined below). Then there exists k = k{d,i,m) G N such that: If |||/i|||fc,ri = 0, then 



lim sup 



-. N m 



N 

n=l i=l 



= 0. 



Applying this result to the nice family {Ai,...,A() where Ai '■= (ai, 0, . . . , 0), A2 '■= 
(0, 02, ... , 0), ... Ai, := (0, . . . , 0, ae), we get: 

Proposition 5.2. Let $(X, \mathcal{X}, \mu, T_1, \ldots, T_\ell)$ be a system, and $f_1, \ldots, f_\ell \in L^\infty(\mu)$ be functions.
Let $\mathcal{H}$ be a Hardy field and $a_1, \ldots, a_\ell \in \mathcal{G} \cap \mathcal{H}$ be functions with different growth and highest
degree $d := \deg(a_1)$. Then there exists $k = k(d, \ell)$ such that: If $|||f_1|||_{k,T_1} = 0$, then the averages

$$\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{\ell} f_i\big(T_i^{[a_i(n)]}x\big)$$

converge to 0 in $L^2(\mu)$.

5.1. Families of $\ell$-tuples and their types.

5.1.1. Families of $\ell$-tuples of functions. Let $\ell, m \in \mathbb{N}$. Given $\ell$ ordered families of functions

$$\mathcal{A}_1 := (a_{1,1}, \ldots, a_{1,m}),\ \ldots,\ \mathcal{A}_\ell := (a_{\ell,1}, \ldots, a_{\ell,m})$$

we define an ordered family of $\ell$-tuples of functions as follows

$$(\mathcal{A}_1, \ldots, \mathcal{A}_\ell) := \big((a_{1,1}, \ldots, a_{\ell,1}), \ldots, (a_{1,m}, \ldots, a_{\ell,m})\big).$$

The maximum of the degrees of the functions in the families $\mathcal{A}_1, \ldots, \mathcal{A}_\ell$ is called the degree of
the family $(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$.
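
As a small illustration of this definition, the following Python sketch forms the ordered family of $\ell$-tuples from $\ell$ ordered families of equal length $m$. The functions are represented only by hypothetical string labels, and the helper name `family_of_tuples` is ours, not notation from the paper.

```python
def family_of_tuples(families):
    """Given ell ordered families, each with m entries, return the ordered
    family of ell-tuples ((a_{1,1},...,a_{ell,1}), ..., (a_{1,m},...,a_{ell,m}))."""
    m = len(families[0])
    if any(len(fam) != m for fam in families):
        raise ValueError("all families must have the same length m")
    # The j-th tuple collects the j-th entry of every family.
    return tuple(zip(*families))


# Two families (ell = 2) with three entries each (m = 3); labels are placeholders.
A1 = ("a11", "a12", "a13")
A2 = ("a21", "a22", "a23")
print(family_of_tuples([A1, A2]))
```

The $j$-th $\ell$-tuple is exactly the collection of functions that appear as iterates of $T_1, \ldots, T_\ell$ in the $j$-th factor of the averages of Proposition 5.1.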

For convenience of exposition, if $\ell$-tuples of bounded functions appear in $(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$ we
remove them, and henceforth we assume:

All families $(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$ that we consider do not contain $\ell$-tuples of bounded functions.

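5.1.2. Definition of type. We fix $d \geq 0$ and restrict ourselves to families of degree between 0
and $d$. We define

$$\mathcal{A}'_1 := \{a_{1,j} \in \mathcal{A}_1 : a_{1,j} \text{ is not bounded}\}$$

and for $i = 2, \ldots, \ell$

$$\mathcal{A}'_i := \{a_{i,j} \in \mathcal{A}_i : a_{i,j} \text{ is not bounded and } a_{i',j} \text{ is bounded for } i' < i\}.$$

For $i = 1, \ldots, \ell$ and $j = 0, 1, \ldots, d$, we let $w_{i,j}$ be the number of distinct non-equivalent
classes of functions of degree $j$ in the family $\mathcal{A}'_i$. We define the (matrix) type of the family
$(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$ to be the matrix

$$\begin{pmatrix} w_{1,d} & \cdots & w_{1,0} \\ w_{2,d} & \cdots & w_{2,0} \\ \vdots & & \vdots \\ w_{\ell,d} & \cdots & w_{\ell,0} \end{pmatrix}.$$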

As in Section 4.1.3, we order these types lexicographically. The following extension of
Lemma 4.5 holds:

Lemma 5.3. Every decreasing sequence of types of families of $\ell$-tuples is stationary.
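
The comparison of matrix types can be sketched in a few lines of Python. This is only an illustration under an assumption: we take the lexicographic order to read the matrix row by row (first row first) and each row from the highest degree column to the lowest; the precise convention is the one fixed in Section 4.1.3. The names `type_key` and `is_smaller` are ours.

```python
def type_key(matrix):
    """Flatten a type matrix (rows i = 1..ell, columns j = d..0) row by row."""
    return tuple(entry for row in matrix for entry in row)


def is_smaller(w_new, w_old):
    """True if w_new strictly precedes w_old in the assumed lexicographic order."""
    return type_key(w_new) < type_key(w_old)


W = [[1, 2, 0], [0, 1, 1]]
# The first row drops at its second entry; later entries may grow arbitrarily.
W_reduced = [[1, 1, 5], [3, 0, 0]]
print(is_smaller(W_reduced, W))
```

Since all entries are non-negative integers and the matrix size is fixed, such an order admits no infinite strictly decreasing sequence, which is the content of Lemma 5.3.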

5.2. Nice families and the van der Corput operation.

5.2.1. Nice families. Henceforth, we are going to work with families of $\ell$-tuples of functions
that satisfy the following properties:

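Definition 5.4. Let $\mathcal{H}$ be a Hardy field, $a_1, \ldots, a_\ell \in \mathcal{G} \cap \mathcal{H}$ be functions, $a_{i,j} \in \mathcal{F}(a_i)$ for
$i = 1, \ldots, \ell$, $j = 1, \ldots, m$, and $\mathcal{A}_1 := (a_{1,1}, \ldots, a_{1,m}), \ldots, \mathcal{A}_\ell := (a_{\ell,1}, \ldots, a_{\ell,m})$. We call the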
ordered family $(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$ of $\ell$-tuples of functions nice if

(1) $a_{1,1} - a_{1,j} \succ 1$ and $a_{1,j} \ll a_{1,1}$ for $j = 2, \ldots, m$;

(2) $a_{i,j} \prec a_{1,1}$ for $i = 2, \ldots, \ell$, $j = 1, \ldots, m$;

(3) $a_{i,1} - a_{i,j} \prec a_{1,1} - a_{1,j}$ for $i = 2, \ldots, \ell$, $j = 2, \ldots, m$.

5.2.2. The van der Corput operation. Given a family $\mathcal{A} := (a_1, \ldots, a_m)$, a function $a \colon [c, \infty) \to
\mathbb{R}$, and $h \in \mathbb{N}$, we define

$$S_h\mathcal{A} := (S_h a_1, \ldots, S_h a_m) \quad \text{and} \quad \mathcal{A} - a := (a_1 - a, \ldots, a_m - a).$$

Given a family of $\ell$-tuples of functions $(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$, an $\ell$-tuple $(a_1, \ldots, a_\ell) \in (\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$, and
$h \in \mathbb{N}$, we define the following operation

$$(a_1, \ldots, a_\ell, h)\text{-vdC}(\mathcal{A}_1, \ldots, \mathcal{A}_\ell) := (\mathcal{A}_{1,h}, \ldots, \mathcal{A}_{\ell,h})^*$$

where

$$\mathcal{A}_{i,h} := (S_h\mathcal{A}_i - a_i,\ \mathcal{A}_i - a_i)$$

for $i = 1, \ldots, \ell$, and $*$ is the operation that removes all $\ell$-tuples that consist of bounded functions
from a given family of $\ell$-tuples of functions.
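
To make the operation concrete, here is a minimal Python sketch restricted to the special case of polynomials with integer coefficients, where everything can be computed exactly. It assumes that $S_h$ denotes the shift $(S_h a)(t) = a(t + h)$ and uses the fact that a polynomial is bounded exactly when it is constant. The representation (coefficient tuples, lowest degree first) and all function names are ours; general Hardy field functions are not covered by this sketch.

```python
from math import comb


def trim(p):
    """Drop trailing zero coefficients; p lists coefficients from degree 0 upward."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return tuple(p)


def shift(p, h):
    """Coefficients of t -> p(t + h), expanded with the binomial theorem."""
    out = [0] * len(p)
    for k, c in enumerate(p):
        for j in range(k + 1):
            out[j] += c * comb(k, j) * h ** (k - j)
    return trim(out)


def sub(p, q):
    """Coefficients of p - q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim(a - b for a, b in zip(p, q))


def is_bounded(p):
    """A polynomial is bounded on [c, oo) exactly when it is constant."""
    return len(trim(p)) <= 1


def vdc(families, a, h):
    """(a_1,...,a_ell,h)-vdC of a family of ell-tuples of polynomials."""
    new = [[sub(shift(p, h), a_i) for p in fam] + [sub(p, a_i) for p in fam]
           for fam, a_i in zip(families, a)]
    # The operation *: drop every ell-tuple whose entries are all bounded.
    keep = [j for j in range(len(new[0]))
            if not all(is_bounded(fam[j]) for fam in new)]
    return [[fam[j] for j in keep] for fam in new]


# One family (ell = 1) consisting of t^2 and t, with a = t and h = 1.
print(vdc([[(0, 0, 1), (0, 1)]], [(0, 1)], 1))
```

In this example the family $(t^2, t)$ becomes $(t^2 + t + 1,\ t^2 - t)$: the two entries coming from $t$ are constant and are removed by $*$, so the functions of degree 1 disappear.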

5.3. Reducing the type. The next lemma enables us to reduce the type of a nice family of
$\ell$-tuples that has positive degree:

Lemma 5.5. Let $(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$ be a nice family of $\ell$-tuples of functions with $\deg(a_{1,1}) \geq 1$. Then
there exists $(a_1, \ldots, a_\ell) \in (\mathcal{A}_1 \cup \{0\}, \ldots, \mathcal{A}_\ell \cup \{0\})$ such that for every large enough $h \in \mathbb{N}$ the
family $(a_1, \ldots, a_\ell, h)$-vdC$(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$ is nice and has strictly smaller type than $(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$.

Proof. Let $\mathcal{A}_i := (a_{i,1}, \ldots, a_{i,m})$ for $i = 1, \ldots, \ell$. Let $i_0 \in \{1, \ldots, \ell\}$ be the largest integer such
that the family $\mathcal{A}'_{i_0}$ is non-empty. We choose $(a_1, \ldots, a_\ell)$ as follows: If $i_0 \neq 1$ (in which case
$\mathcal{A}'_\ell, \mathcal{A}'_{\ell-1}, \ldots, \mathcal{A}'_{i_0+1}$ are empty, and $\mathcal{A}'_{i_0}$ is non-empty), then we take $a_1 = \cdots = a_{i_0-1} = 0$ and let
$a_{i_0}$ be a function of minimal degree in $\mathcal{A}'_{i_0}$. Then for every $h \in \mathbb{N}$, one checks using Lemmas 3.2
and 4.9 that the first $i_0 - 1$ rows of the matrix type remain unchanged, and the $i_0$-th row will
get "reduced", leading to a smaller matrix type.

If $i_0 = 1$, then the families $\mathcal{A}'_\ell, \mathcal{A}'_{\ell-1}, \ldots, \mathcal{A}'_2$ are all empty. If $\mathcal{A}_1$ consists of a single function,
namely $a_{1,1}$, then we choose $(a_1, \ldots, a_\ell) := (a_{1,1}, \ldots, a_{\ell,1})$ and the result follows using
Lemma 3.2. Therefore, we can assume that $\mathcal{A}_1$ contains some function other than $a_{1,1}$. We
consider two cases. If $a \sim a_{1,1}$ for all $a \in \mathcal{A}_1$, then we choose $(a_1, \ldots, a_\ell) := (a_{1,1}, \ldots, a_{\ell,1})$.
Otherwise, we choose $(a_1, \ldots, a_\ell) \in (\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$ with $a_1 \not\sim a_{1,1}$, and such that $a_1$ is a function
in $\mathcal{A}'_1$ with minimal degree (such a choice exists since $a_{1,1}$ has the highest degree in $\mathcal{A}_1$). In all
cases, for every $h \in \mathbb{N}$, one checks using Lemmas 3.2 and 4.9 that the first row of the matrix
type of $(a_1, \ldots, a_\ell, h)$-vdC$(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$ is "smaller" than that of $(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$.

It remains to verify that for large enough $h \in \mathbb{N}$ the family $(a_1, \ldots, a_\ell, h)$-vdC$(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$
is nice. This argument is very similar to the one used in Lemma 4.10 and so we omit it. □

5.4. Proof of Proposition 5.1. Proposition 5.1 is proved by an induction on the type of nice
families of $\ell$-tuples of functions. The base case covers all families with degree 0 and is proved
in a way completely analogous to the case $\ell = 2$ that was treated in the previous section. The
inductive step is also completely analogous to the case $\ell = 2$ and is omitted.



6. Correlation estimates

In order to motivate the estimates that are proved in this section we recap part of our plan
for studying the limiting behavior of the averages

$$\frac{1}{N}\sum_{n=1}^{N} f_1\big(T_1^{[a_1(n)]}x\big)\cdot f_2\big(T_2^{[a_2(n)]}x\big) \tag{25}$$

when $a_2 \prec a_1$. We showed in Proposition 4.1 that there exists $d \in \mathbb{N}$ such that if $|||f_1|||_{d,T_1} = 0$,
then the averages (25) converge to 0 in $L^2(\mu)$. Our goal is to prove a similar result for the
function $f_2$. Using the decomposition result of Proposition 3.4 we can reduce matters to showing
that there exists $d \in \mathbb{N}$ such that if $|||f_2|||_{d,T_2} = 0$, then

$$\frac{1}{N}\sum_{n=1}^{N} \mathcal{D}_x([a_1(n)])\cdot f_2\big(T_2^{[a_2(n)]}x\big) \to 0 \quad \text{in } L^2(\mu),$$

where $(\mathcal{D}_x(n))$ is a uniformly bounded sequence of measurable functions such that for almost
every $x \in X$ the sequence $(\mathcal{D}_x(n))$ is a dual sequence of level at most $d$. This motivates us to
seek estimates that connect averages of the form

$$\frac{1}{N}\sum_{n=1}^{N} \mathcal{D}([a(n)])\cdot A(n)$$

where $(\mathcal{D}(n))$ is a dual sequence, and averages involving only products of translates of the
sequence $(A(n))$. We produce such estimates in this section.

6.1. Correlation estimates for sequences. 

Proposition 6.1. Let $\mathcal{H}$ be a Hardy field and $b_1, \ldots, b_l \in \mathcal{H}$ be functions with maximum degree
$d \geq -1$. Let $(X, \mathcal{X}, \mu)$ be a probability space, $(A(n)), (\mathcal{D}_1(n)), \ldots, (\mathcal{D}_l(n))$ be uniformly bounded
sequences of $L^\infty(\mu)$ functions, such that for almost every $x \in X$, for $i = 1, \ldots, l$, the sequences
$(\mathcal{D}_{i,x}(n))$ are dual sequences of level at most $r \in \mathbb{N}$. Then there exist $s_0 = s_0(d, l, r) \in \mathbb{N}$ and
$C = C(d, l, r) \in \mathbb{R}$ such that for some $s \leq s_0$ we have



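$$\limsup_{N\to\infty}\Big\|\frac{1}{N}\sum_{n=1}^{N} A_x(n)\cdot\prod_{i=1}^{l}\mathcal{D}_{i,x}([b_i(n)])\Big\|_{L^2(\mu)}^{2^s} \leq C\cdot\limsup_{H_s\to\infty}\frac{1}{H_s}\sum_{h_s=1}^{H_s}\cdots\limsup_{H_1\to\infty}\frac{1}{H_1}\sum_{h_1=1}^{H_1}\ \limsup_{N\to\infty}\ \sup_{E\subset\mathbb{N}}\Big\|\frac{1}{N}\sum_{n=1}^{N}\prod_{\epsilon\in\{0,1\}^s}\mathcal{C}^{|\epsilon|}A_x(n+\epsilon\cdot\mathbf{h})\cdot\mathbf{1}_E(n)\Big\|_{L^2(\mu)}$$

where $\mathbf{h} := (h_1, \ldots, h_s)$.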



Remark. Notice that we do not have to assume that $b_1, \ldots, b_l \in \mathcal{G}$. When $l = 1$ and $b_1(t) = t$
the result was proved in [27].

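Proof. To begin with, using identity (16), we see that there exist $k, \ell \in \mathbb{N}$ (in fact, $k = lr$ and
$\ell = l(2^r - 1)$), vector valued sequences of functions $\mathbf{b}_1, \ldots, \mathbf{b}_\ell \colon [c, \infty) \to \mathbb{R}^k$, with coordinate
functions $b_{i,j}$ taken from the set $\{0, b_1, \ldots, b_l\}$, and sequences $d_1, \ldots, d_\ell \colon \mathbb{N}^k \to L^\infty(\mu)$, such
that

$$\prod_{i=1}^{l}\mathcal{D}_{i,x}([b_i(n)]) = \lim_{M\to\infty}\frac{1}{M^k}\sum_{\mathbf{m}\in[1,M]^k}\prod_{i=1}^{\ell} d_{i,x}(\mathbf{m}+[\mathbf{b}_i(n)])$$

where $[\mathbf{b}_i] := ([b_{i,1}], \ldots, [b_{i,k}])$ and $\mathbf{m} := (m_1, \ldots, m_k)$. Furthermore, all functions $b_{i,j}$ and $d_j$
are bounded by 1. It therefore suffices to prove the following claim: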

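Claim: Let $k, \ell \in \mathbb{N}$, $\mathcal{H}$ be a Hardy field, and for $i = 1, \ldots, \ell$, let $\mathbf{b}_i = (b_{i,1}, \ldots, b_{i,k})$ where $b_{i,j} \in
\mathcal{H}$ are functions with maximum degree $d \geq -1$. Furthermore, let $(A(n)), (d_1(\mathbf{m})), \ldots, (d_\ell(\mathbf{m}))$,
$\mathbf{m} \in \mathbb{N}^k$, be sequences of $L^\infty(\mu)$ functions, all bounded by 1. Then there exists $s_0 = s_0(d, k, \ell) \in
\mathbb{N}$ such that for some $s \leq s_0$ the expression

$$\limsup_{N\to\infty}\ \sup_{\|d_i\|_\infty\leq 1,\ E\subset\mathbb{N}}\ \limsup_{M\to\infty}\Big\|\frac{1}{N}\sum_{n=1}^{N}\Big(A_x(n)\cdot\frac{1}{M^k}\sum_{\mathbf{m}\in[1,M]^k}\prod_{i=1}^{\ell} d_{i,x}(\mathbf{m}+[\mathbf{b}_i(n)])\cdot\mathbf{1}_E(n)\Big)\Big\|_{L^2(\mu)}^{2^s} \tag{26}$$

is bounded by a constant $C = C(d, k, \ell)$ times

$$\limsup_{H_s\to\infty}\frac{1}{H_s}\sum_{h_s=1}^{H_s}\cdots\limsup_{H_1\to\infty}\frac{1}{H_1}\sum_{h_1=1}^{H_1}\ \limsup_{N\to\infty}\ \sup_{E\subset\mathbb{N}}\Big\|\frac{1}{N}\sum_{n=1}^{N}\prod_{\epsilon\in\{0,1\}^s}\mathcal{C}^{|\epsilon|}A_x(n+\epsilon\cdot\mathbf{h})\cdot\mathbf{1}_E(n)\Big\|_{L^2(\mu)}$$

where $[\mathbf{b}_i] := ([b_{i,1}], \ldots, [b_{i,k}])$ and $\mathbf{h} := (h_1, \ldots, h_s)$.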

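Equivalently, it suffices to prove the same estimate with the left hand side replaced with

$$\limsup_{N\to\infty}\ \limsup_{M\to\infty}\Big\|\frac{1}{N}\sum_{n=1}^{N}\Big(A_x(n)\cdot\frac{1}{M^k}\sum_{\mathbf{m}\in[1,M]^k}\prod_{i=1}^{\ell} d_{i,x,N}(\mathbf{m}+[\mathbf{b}_i(n)])\cdot\mathbf{1}_{E_N}(n)\Big)\Big\|_{L^2(\mu)}^{2^s}$$

where for $N \in \mathbb{N}$ the sequences of functions $d_{1,N}, \ldots, d_{\ell,N} \colon \mathbb{N}^k \to L^\infty(\mu)$ are bounded by 1.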

For $i = 1, \ldots, k$, let $\mathcal{A}_i = (b_{1,i}, \ldots, b_{\ell,i})$, and define the matrix type $W$ of the family of
$k$-tuples $(\mathcal{A}_1, \ldots, \mathcal{A}_k)$ as in Section 5.1. Notice that having fixed $d, k, \ell$, there is only a finite
number of possibilities for $W$. The proof of the claim is going to proceed by induction on $W$.
We remark that it suffices to show that the constants $C$ and $s$ depend only on $W$, $k$, and $\ell$.
Furthermore, we can assume that $b_{1,1}$ is the function with the largest growth rate.
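Base case: We assume that $d = -1$, in which case all functions $b_{i,j}(t)$ converge to 0. Then
for $i = 1, \ldots, \ell$, for all large enough $n \in \mathbb{N}$ the sequence $[\mathbf{b}_i(n)]$ takes values on some finite subset
$F_i \subset \mathbb{Z}^k$ with $|F_i| \leq 2^k$. Without loss of generality we can assume that this happens for every
$n \in \mathbb{N}$. Let $E_{N,1}, \ldots, E_{N,t}$ ($t \leq 2^{k\ell}$) be sets that form a partition of $E_N$ into sets where all
the sequences are constant. Then there exist constants $|c_{j,N}| \leq 1$ such that for $s = 0$ the
quantity we want to estimate is equal to

$$\limsup_{N\to\infty}\Big\|\sum_{j=1}^{t} c_{j,N}\cdot\frac{1}{N}\sum_{n=1}^{N} A_x(n)\cdot\mathbf{1}_{E_{N,j}}(n)\Big\|_{L^2(\mu)} \leq t\cdot\limsup_{N\to\infty}\ \sup_{E\subset\mathbb{N}}\Big\|\frac{1}{N}\sum_{n=1}^{N} A_x(n)\cdot\mathbf{1}_E(n)\Big\|_{L^2(\mu)}.$$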



Inductive step: Let $(\mathcal{A}_1, \ldots, \mathcal{A}_k)$ be a family of $\ell$ ordered $k$-tuples of functions with matrix
type $W$ and degree $d \geq 0$, in which case $\deg(b_{1,1}) \geq 0$. Suppose that the claim holds for
every family of $2\ell$ ordered $k$-tuples of functions of matrix type $W'$ strictly less than $W$ with
$s_0 = s_0(W', k, 2\ell)$ and $C = C(W', k, 2\ell)$. We let
(27) 

so{W, k, £) = max {so{W', k,2e)) + l, C{W, k, £) = 2(2^=^+1)2^°^'^'*''' 1 {C{W', k, 21)) 



where the max is taken over the finitely many matrix types of families of at most $2\ell$ functions
that are smaller than $W$. The induction will be complete if we show that the asserted estimate
holds for the family $(\mathcal{A}_1, \ldots, \mathcal{A}_k)$ for these values of $s(W, k, \ell)$ and $C(W, k, \ell)$.
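We start by using the Cauchy-Schwarz inequality

$$\limsup_{N\to\infty}\ \limsup_{M\to\infty}\Big\|\frac{1}{N}\sum_{n=1}^{N}\Big(A_x(n)\cdot\frac{1}{M^k}\sum_{\mathbf{m}\in[1,M]^k}\prod_{i=1}^{\ell} d_{i,x,N}(\mathbf{m}+[\mathbf{b}_i(n)])\cdot\mathbf{1}_{E_N}(n)\Big)\Big\|_{L^2(\mu)}^{2} \leq \limsup_{N\to\infty}\ \limsup_{M\to\infty}\frac{1}{M^k}\sum_{\mathbf{m}\in[1,M]^k}\Big\|\frac{1}{N}\sum_{n=1}^{N}\Big(A_x(n)\cdot\prod_{i=1}^{\ell} d_{i,x,N}(\mathbf{m}+[\mathbf{b}_i(n)])\cdot\mathbf{1}_{E_N}(n)\Big)\Big\|_{L^2(\mu)}^{2}. \tag{28}$$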



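Using Lemma 3.6, ignoring negligible terms, and using the Cauchy-Schwarz inequality, we find
that the last expression is bounded by 2 times

$$\limsup_{H\to\infty}\frac{1}{H}\sum_{h=1}^{H}\ \limsup_{N\to\infty}\ \limsup_{M\to\infty}\Big\|\frac{1}{N}\sum_{n=1}^{N}\Big(A_x(n+h)\cdot\overline{A_x}(n)\cdot\frac{1}{M^k}\sum_{\mathbf{m}\in[1,M]^k}\prod_{i=1}^{\ell} d_{i,x,N}(\mathbf{m}+[\mathbf{b}_i(n+h)])\cdot\overline{d_{i,x,N}}(\mathbf{m}+[\mathbf{b}_i(n)])\cdot\mathbf{1}_{E_N}(n+h)\cdot\mathbf{1}_{E_N}(n)\Big)\Big\|_{L^2(\mu)}.$$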

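We make the change of variables $\mathbf{m} \to \mathbf{m} - [\mathbf{b}(n)]$, for some vector valued function $\mathbf{b}$ that will
be determined later. Ignoring negligible terms, we see that the last expression is equal to

$$\limsup_{H\to\infty}\frac{1}{H}\sum_{h=1}^{H}\ \limsup_{N\to\infty}\ \limsup_{M\to\infty}\Big\|\frac{1}{N}\sum_{n=1}^{N}\Big(A_x(n+h)\cdot\overline{A_x}(n)\cdot\frac{1}{M^k}\sum_{\mathbf{m}\in[1,M]^k}\prod_{i=1}^{\ell} d_{i,x,N}\big(\mathbf{m}+[\mathbf{b}_i(n+h)-\mathbf{b}(n)]+\mathbf{e}_{i,h}(n)\big)\cdot\overline{d_{i,x,N}}\big(\mathbf{m}+[\mathbf{b}_i(n)-\mathbf{b}(n)]+\mathbf{e}'_{i,h}(n)\big)\cdot\mathbf{1}_{E_{N,h}}(n)\Big)\Big\|_{L^2(\mu)} \tag{29}$$

where the sequences $(\mathbf{e}_{i,h}(n))$, $(\mathbf{e}'_{i,h}(n))$ take values in $\{0, 1\}^k$ and $E_{N,h} := E_N \cap (E_N - h)$.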
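Notice that

$$\prod_{i=1}^{\ell} d_{i,x,N}\big(\mathbf{m}+[\mathbf{b}_i(n+h)-\mathbf{b}(n)]+\mathbf{e}_{i,h}(n)\big)\cdot\overline{d_{i,x,N}}\big(\mathbf{m}+[\mathbf{b}_i(n)-\mathbf{b}(n)]+\mathbf{e}'_{i,h}(n)\big)\cdot\mathbf{1}_{E_{N,h}}(n) = \sum_{j=1}^{t}\prod_{i=1}^{\ell} d_{i,j,x,N}\big(\mathbf{m}+[\mathbf{b}_i(n+h)-\mathbf{b}(n)]\big)\cdot d'_{i,j,x,N}\big(\mathbf{m}+[\mathbf{b}_i(n)-\mathbf{b}(n)]\big)\cdot\mathbf{1}_{E_{N,h,j}}(n),$$

where the sets $E_{N,h,1}, \ldots, E_{N,h,t}$ ($t \leq 2^{2k\ell}$) form a partition of $E_{N,h}$ into sets where the sequences
$\mathbf{e}_{i,h}, \mathbf{e}'_{i,h}$ are constant (with coordinates either 0 or 1), $d_{i,j,N}(\mathbf{m}) := d_{i,N}(\mathbf{m} + \mathbf{c}_j)$, and $d'_{i,j,N}(\mathbf{m}) := \overline{d_{i,N}}(\mathbf{m} + \mathbf{c}'_j)$.
Combining the above we get that the limit in (29) is bounded by

$$t\cdot\limsup_{H\to\infty}\frac{1}{H}\sum_{h=1}^{H}\ \limsup_{N\to\infty}\ \sup_{\|d_i\|_\infty,\|d'_i\|_\infty\leq 1,\ E\subset\mathbb{N}}\ \limsup_{M\to\infty}\Big\|\frac{1}{N}\sum_{n=1}^{N}\Big(A_x(n+h)\cdot\overline{A_x}(n)\cdot\frac{1}{M^k}\sum_{\mathbf{m}\in[1,M]^k}\prod_{i=1}^{\ell} d_{i,x}\big(\mathbf{m}+[\mathbf{b}_i(n+h)-\mathbf{b}(n)]\big)\cdot d'_{i,x}\big(\mathbf{m}+[\mathbf{b}_i(n)-\mathbf{b}(n)]\big)\cdot\mathbf{1}_E(n)\Big)\Big\|_{L^2(\mu)}.$$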
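This naturally leads us to consider a new family that consists of $2\ell$ ordered $k$-tuples of
functions. Choosing $\mathbf{b}$ exactly as in the proof of Lemma 5.5, and following the argument used
there, we see that this new family has matrix type $W'$ strictly smaller than $W$.

For this choice of $\mathbf{b}$, raising both sides of (28) to the power $2^{s(W',k,2\ell)}$, working through the
previous estimates (we also use the Hölder inequality at the last step), and using the induction
hypothesis, we get that for $s = s(W', k, 2\ell) + 1$ and $C = C(W, k, \ell)$, defined as in (27), the left
hand side in (26) is bounded by $C$ times

$$\limsup_{H\to\infty}\frac{1}{H}\sum_{h=1}^{H}\ \limsup_{H_s\to\infty}\frac{1}{H_s}\sum_{h_s=1}^{H_s}\cdots\limsup_{H_1\to\infty}\frac{1}{H_1}\sum_{h_1=1}^{H_1}\ \limsup_{N\to\infty}\ \sup_{E\subset\mathbb{N}}\Big\|\frac{1}{N}\sum_{n=1}^{N}\prod_{\epsilon\in\{0,1\}^s}\mathcal{C}^{|\epsilon|}A_x(n+h+\epsilon\cdot\mathbf{h})\cdot\mathcal{C}^{|\epsilon|}\overline{A_x}(n+\epsilon\cdot\mathbf{h})\cdot\mathbf{1}_E(n)\Big\|_{L^2(\mu)}$$

where $\mathbf{h} = (h_1, \ldots, h_s)$. The last expression is equal to

$$\limsup_{H\to\infty}\frac{1}{H}\sum_{h=1}^{H}\ \limsup_{H_s\to\infty}\frac{1}{H_s}\sum_{h_s=1}^{H_s}\cdots\limsup_{H_1\to\infty}\frac{1}{H_1}\sum_{h_1=1}^{H_1}\ \limsup_{N\to\infty}\ \sup_{E\subset\mathbb{N}}\Big\|\frac{1}{N}\sum_{n=1}^{N}\prod_{\epsilon\in\{0,1\}^{s+1}}\mathcal{C}^{|\epsilon|}A_x(n+\epsilon\cdot\mathbf{h})\cdot\mathbf{1}_E(n)\Big\|_{L^2(\mu)}$$

where $\mathbf{h} = (h_1, \ldots, h_s, h)$, as desired. □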



6.2. Correlation estimates for ergodic averages. Next we combine Proposition 6.1 with 
Proposition 5.1 in order to prove a result that will be crucial in the proof of Theorem 2.3. 

Proposition 6.2. Let $(X, \mathcal{X}, \mu, T_1, \ldots, T_\ell)$ be a system and $f_1, \ldots, f_m \in L^\infty(\mu)$ be functions.
Let $(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$ be a nice ordered family of $\ell$-tuples of functions with degree at most $d$ and
such that $\deg(a_{1,1}) \geq 1$. Let $\mathcal{H}$ be a Hardy field and $b_1, \ldots, b_l \in \mathcal{H}$ be functions with maximum
degree $d$. Furthermore, for $i = 1, \ldots, l$, let $(\mathcal{D}_i(n))$ be a sequence of functions in $L^\infty(\mu)$, all
bounded by 1, such that for almost every $x \in X$, the sequences $(\mathcal{D}_{i,x}(n))$ are dual sequences of
level at most $r \in \mathbb{N}$. Then there exists $k = k(d, l, \ell, m, r) \in \mathbb{N}$ such that: If $|||f_1|||_{k,T_1} = 0$, then
the averages

$$\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{m} f_i\big(T_1^{[a_{1,i}(n)]}\cdots T_\ell^{[a_{\ell,i}(n)]}x\big)\cdot\prod_{i=1}^{l}\mathcal{D}_{i,x}([b_i(n)]) \tag{30}$$

converge to 0 in $L^2(\mu)$.

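Proof. Let $s := s(d, l, r)$ be as in the statement of Proposition 6.1. We assume that $|||f_1|||_{k,T_1} = 0$
where $k := k(d, \ell, 2^s m)$ is given by Proposition 5.1. We let $s' := 2^s$, and for $x \in X$, let $(A_x(n))$
be the sequence of $L^\infty(\mu)$ functions defined by

$$A_x(n) := \prod_{i=1}^{m} f_i\big(T_1^{[a_{1,i}(n)]}\cdots T_\ell^{[a_{\ell,i}(n)]}x\big).$$

For $i = 1, \ldots, \ell$, consider the following ordered families each consisting of $ms'$ functions:

$$\mathcal{A}'_i := \big(a_{i,1}(n + r_1), \ldots, a_{i,1}(n + r_{s'}), \ldots, a_{i,m}(n + r_1), \ldots, a_{i,m}(n + r_{s'})\big).$$

Since $\deg(a_{1,1}) \geq 1$ and $(\mathcal{A}_1, \ldots, \mathcal{A}_\ell)$ is a nice ordered family, one can check using Lemma 4.9
that $(\mathcal{A}'_1, \ldots, \mathcal{A}'_\ell)$ is also a nice ordered family for all $\mathbf{r}$ in a subset $R \subset \mathbb{N}^{s'}$ of the form

$$R := \{\mathbf{r} = (r_1, \ldots, r_{s'}) : r_1 \geq c_1,\ r_2 \geq c_2(r_1),\ \ldots,\ r_{s'} \geq c_{s'}(r_1, \ldots, r_{s'-1})\}$$

for some sequences $c_i \colon \mathbb{N}^{i-1} \to \mathbb{N}$. Using Proposition 5.1 we have that

$$\lim_{N\to\infty}\ \sup_{E\subset\mathbb{N}}\Big\|\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{s'} A_x(n + r_i)\cdot\mathbf{1}_E(n)\Big\|_{L^2(\mu)} = 0$$

for all $\mathbf{r} \in R$. Furthermore, a similar conclusion holds if one replaces some of the sequences of
functions $(A(n + r_i))_{n\in\mathbb{N}}$ with their complex conjugates.

Hence, for a set of $\mathbf{h} \in \mathbb{N}^s$ that has similar structure as $R$, we have

$$\lim_{N\to\infty}\ \sup_{E\subset\mathbb{N}}\Big\|\frac{1}{N}\sum_{n=1}^{N}\prod_{\epsilon\in\{0,1\}^s}\mathcal{C}^{|\epsilon|}A_x(n+\epsilon\cdot\mathbf{h})\cdot\mathbf{1}_E(n)\Big\|_{L^2(\mu)} = 0.$$

We deduce from Proposition 6.1 that the averages (30) converge to 0 in $L^2(\mu)$, as desired. □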

7. Seminorm estimates for the lower degree iterates and proof of convergence

In this section we prove Theorem 2.3. We first handle the case where all the iterates have 
super-linear growth, and later on use an averaging trick to handle the general case. 

7.1. Seminorm estimates in the positive degree case. 

Proposition 7.1. Let $(X, \mathcal{X}, \mu, T_1, \ldots, T_\ell)$ be a system and $f_1, \ldots, f_\ell \in L^\infty(\mu)$ be functions.
Let $\mathcal{H}$ be a Hardy field and $a_1, \ldots, a_\ell \in \mathcal{G} \cap \mathcal{H}$ be functions with different growth rates and
degree between 1 and $d$ for some $d \in \mathbb{N}$. Then there exists $k = k(d, \ell)$ such that the following
holds: If $|||f_i|||_{k,T_i} = 0$ for some $i \in \{1, \ldots, \ell\}$, then the averages

$$\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{\ell} f_i\big(T_i^{[a_i(n)]}x\big)$$

converge to 0 in $L^2(\mu)$.

Proposition 7.1 follows from the following more general result: 

Proposition 7.2. Let $(X, \mathcal{X}, \mu, T_1, \ldots, T_\ell)$ be a system and $f_1, \ldots, f_\ell \in L^\infty(\mu)$ be functions.
Let $\mathcal{H}$ be a Hardy field and $a_1, \ldots, a_\ell \in \mathcal{G} \cap \mathcal{H}$ be functions with different growth rates and
degree between 1 and $d$ for some $d \in \mathbb{N}$. Furthermore, let $b_1, \ldots, b_l \in \mathcal{H}$ have degree at most
$d$. For $i = 1, \ldots, l$, let $(\mathcal{D}_{i,x}(n))_{n\in\mathbb{N}}$ be a uniformly bounded sequence of measurable functions
such that, for almost every $x \in X$, the sequence $(\mathcal{D}_{i,x}(n))_{n\in\mathbb{N}}$ is a dual sequence of level at most
$r$. Then there exists $k = k(d, l, \ell, r)$ such that the following holds: If $|||f_i|||_{k,T_i} = 0$ for some
$i \in \{1, \ldots, \ell\}$, then the averages

$$\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{\ell} f_i\big(T_i^{[a_i(n)]}x\big)\cdot\prod_{i=1}^{l}\mathcal{D}_{i,x}([b_i(n)]) \tag{31}$$

converge to 0 in $L^2(\mu)$.
Proof. The proof goes by induction on the number $\ell$ of transformations. For $\ell = 1$, the result
follows from the case $\ell = 1$ of Proposition 6.2. We take $\ell \geq 2$, assume that the result holds
for $\ell - 1$ transformations, and we are going to prove that it holds for $\ell$ transformations.

Without loss of generality we can assume that $a_1$ is the fastest growing function, and that
all functions and dual sequences are bounded by 1. By Proposition 6.2, there exists $k_0 =
k_0(d, l, \ell, r)$ such that, if $|||f_1|||_{k_0,T_1} = 0$, then the averages (31) converge to 0 in $L^2(\mu)$. Let
$k_1 := k(\tilde d, \tilde l, \ell - 1, \tilde r)$ be the integer that the induction hypothesis gives for $\tilde d := \max\{d, k_0\}$,
$\tilde r := \max\{r, k_0\}$, and $\tilde l := l + 1$. Suppose that $|||f_i|||_{k_1,T_i} = 0$ for some $i \in \{2, \ldots, \ell\}$. The
induction will be complete if we show that the averages (31) converge to 0 in $L^2(\mu)$.

Let $\varepsilon > 0$. By Proposition 3.4 we can express $f_1$ as $f_1 = f_s + f_u + f_e$, where $f_s, f_u, f_e \in L^\infty(\mu)$,
$|||f_u|||_{k_0,T_1} = 0$, $\|f_e\|_{L^1(\mu)} \leq \varepsilon$, and $f_s = \sum_{i=1}^{m} c_i f_{s,i}$, for some $m \in \mathbb{N}$, $c_i \in \mathbb{R}$, $f_{s,i} \in L^\infty(\mu)$, and
for almost every $x \in X$ the sequences $(f_{s,i}(T_1^n x))_{n\in\mathbb{N}}$ are dual sequences of level at most $k_0$. As
we explained before, when computing the limit in $L^2(\mu)$ of the averages (31), the contribution
of the term $f_u$ is negligible. Furthermore, by the induction hypothesis, the same holds for the
contribution of the term $f_{s,i}$, for $i = 1, \ldots, m$, and as a consequence for the term $f_s$. It remains
to handle the contribution of the term $f_e$. When $f_1$ is replaced by $f_e$, the $L^1(\mu)$ norm of the
averages (31) can be bounded by

$$\Big\|\frac{1}{N}\sum_{n=1}^{N} T_1^{[a_1(n)]}|f_e|\Big\|_{L^1(\mu)} \leq \frac{1}{N}\sum_{n=1}^{N}\big\|T_1^{[a_1(n)]}|f_e|\big\|_{L^1(\mu)} = \|f_e\|_{L^1(\mu)} \leq \varepsilon.$$

Since $\varepsilon$ was arbitrary, we deduce that the averages (31) converge to 0 in $L^1(\mu)$, and as a
consequence in $L^2(\mu)$ (since all functions $f_i$ are bounded). This completes the proof. □

We also record a variant of this result that will be used later. 

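Proposition 7.3. Let $(X, \mathcal{X}, \mu, T_1, \ldots, T_\ell)$ be a system and $f_1, \ldots, f_\ell \in L^\infty(\mu)$ be functions.
Let $\mathcal{H}$ be a Hardy field and $a_1, \ldots, a_\ell \in \mathcal{G} \cap \mathcal{H}$ be functions with different growth rates and degree
between 1 and $d$. Then there exists $k = k(d, \ell)$ such that the following holds: If $|||f_i|||_{k,T_i} = 0$
for some $i \in \{1, \ldots, \ell\}$, then

$$\lim_{R\to\infty}\ \limsup_{N\to\infty}\ \frac{1}{N}\sum_{n=1}^{N}\Big\|\frac{1}{R}\sum_{r=1}^{R}\prod_{i=1}^{\ell} T_i^{[a_i(Rn+r)]}f_i\Big\|_{L^2(\mu)} = 0. \tag{32}$$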



Proof. Suppose that $a_1$ is the fastest growing function and all functions are bounded by 1.
Notice that for every $R \in \mathbb{N}$ we have

2 



-.TV R i 



n=l r=l 1=1 



L^{n) l<ri,r2 



l<ri,r2<R n=l-^ i=l 



For $r_1 \neq r_2$, using Proposition 6.2 (the corresponding family of $\ell$-tuples is nice) we get that
there exists $k_0 = k_0(d, \ell)$ such that if $|||f_1|||_{k_0,T_1} = 0$, then the averages

$$\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{\ell} T_i^{[a_i(Rn+r_1)]}f_i\cdot T_i^{[a_i(Rn+r_2)]}\bar f_i \tag{33}$$

converge to 0 in $L^2(\mu)$. It is then straightforward to adapt the proof of Proposition 7.2 in order
to get that there exists $k = k(d, \ell)$ such that for $r_1 \neq r_2$, if $|||f_i|||_{k,T_i} = 0$, then the averages (33)
converge to 0 in $L^2(\mu)$. We deduce that for every $R \in \mathbb{N}$ we have

^ N R e ^ 



Taking $R \to \infty$ we deduce that (32) holds, which completes the proof. □

7.2. Equidistribution on nilmanifolds.

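Proposition 7.4 ([15]). Let $\mathcal{H}$ be a Hardy field and $a_1, \ldots, a_\ell \in \mathcal{G} \cap \mathcal{H}$ be functions with
different growth rates and positive degree. For $i = 1, \ldots, \ell$, let $X_i := G_i/\Gamma_i$ be nilmanifolds,
$b_i \in G_i$, and $x_i \in X_i$. Then the sequence

$$\big(b_1^{[a_1(n)]}x_1, \ldots, b_\ell^{[a_\ell(n)]}x_\ell\big)$$

is equidistributed on the nilmanifold $\prod_{i=1}^{\ell}\overline{\{b_i^n x_i : n \in \mathbb{N}\}}$.

For future use we record an identity that follows from the previous result: For all functions
$F_i \in C(X_i)$ we have

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{\ell} F_i\big(b_i^{[a_i(n)]}x_i\big) = \prod_{i=1}^{\ell}\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} F_i(b_i^n x_i). \tag{34}$$

We are also going to use another identity. Its proof is essentially contained in [15].

Proposition 7.5. Let $\mathcal{H}$ be a Hardy field and $a_1, \ldots, a_\ell \in \mathcal{G} \cap \mathcal{H}$ be functions with different
growth rates and positive degree. For $i = 1, \ldots, \ell$, let $X_i := G_i/\Gamma_i$ be nilmanifolds, $b_i \in G_i$,
$x_i \in X_i$, and $F \in C(X)$, where $X = X_1 \times \cdots \times X_\ell$. Then

$$\lim_{R\to\infty}\ \limsup_{N\to\infty}\ \frac{1}{N}\sum_{n=1}^{N}\Big|\frac{1}{R}\sum_{r=1}^{R} F\big(b_1^{[a_1(Rn+r)]}x_1, \ldots, b_\ell^{[a_\ell(Rn+r)]}x_\ell\big) - \int F\, dm_{\tilde X}\Big| = 0, \tag{35}$$

where $\tilde X = \prod_{i=1}^{\ell}\overline{\{b_i^n x_i : n \in \mathbb{N}\}}$.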

Sketch of Proof. Using a straightforward modification of the reduction argument of Section
5.2 in [15], we can reduce matters to proving the following statement: "For $i = 1, \ldots, \ell$, let
$X_i = G_i/\Gamma_i$ be nilmanifolds, with $G_i$ connected and simply connected, $x_i \in X_i$, $b_i \in G_i$ act
ergodically on $X_i$ (meaning the sequence $(b_i^n x_i)$ is equidistributed in $X_i$ for every $x_i \in X_i$), and
$F \in C(X)$, where $X = X_1 \times \cdots \times X_\ell$. Then (35) holds with $X$ in place of $\tilde X$."

This was verified while proving Proposition 5.3 in [15], completing the proof. □
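
The simplest instance of Proposition 7.4 can be explored numerically. The Python sketch below takes $\ell = 1$, the circle $\mathbb{R}/\mathbb{Z}$ as the nilmanifold, $b$ the rotation by an irrational $\alpha$, and computes averages of a character along $[a(n)]$. The choices $\alpha = \sqrt{2}$ and $a(t) = t^{3/2}$ are ours and purely illustrative (we assume that $t^{3/2}$ is an admissible function of positive degree); equidistribution predicts that the printed absolute values tend to 0, but the script itself proves nothing and no rate is claimed.

```python
import cmath
import math


def hardy_average(alpha, a, character, N):
    """Average of e(character * alpha * [a(n)]) over n = 1..N, where
    e(t) = exp(2*pi*i*t) and [.] denotes the integer part."""
    total = 0j
    for n in range(1, N + 1):
        total += cmath.exp(2j * math.pi * character * alpha * math.floor(a(n)))
    return total / N


alpha = math.sqrt(2)        # irrational rotation number (illustrative choice)
a = lambda t: t ** 1.5      # illustrative function of positive degree
for N in (10**3, 10**4, 10**5):
    print(N, abs(hardy_average(alpha, a, 1, N)))
```

For the trivial character (`character = 0`) every term equals 1, which matches the integral of the constant function 1 over the circle.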

7.3. Proof of Theorem 2.3 in the positive degree case. 

Proposition 7.6. Theorem 2.3 holds when all functions $a_1, \ldots, a_\ell$ have positive degree.

Proof. We want to show that for every system $(X, \mathcal{B}, \mu, T_1, \ldots, T_\ell)$ and functions $f_1, \ldots, f_\ell \in
L^\infty(\mu)$, we have

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{\ell} T_i^{[a_i(n)]}f_i = \prod_{i=1}^{\ell}\tilde f_i, \tag{36}$$

where convergence is taken in $L^2(\mu)$ and $\tilde f_i := \mathbb{E}(f_i \mid \mathcal{I}_{T_i})$. By Proposition 7.2 there exists $k$ such
that, if $|||f_i|||_{k,T_i} = 0$ for some $i \in \{1, \ldots, \ell\}$, then the limit of the averages in (36) is 0, where
convergence takes place in $L^2(\mu)$ (and hence in $L^1(\mu)$ as well).

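Let $\varepsilon > 0$. By Proposition 3.4, for $i = 1, \ldots, \ell$ we can write $f_i = f_{i,s} + f_{i,u} + f_{i,e}$, where
$f_{i,s}, f_{i,u}, f_{i,e} \in L^\infty(\mu)$, $|||f_{i,u}|||_{k,T_i} = 0$, $\|f_{i,e}\|_{L^1(\mu)} \leq \varepsilon$, and $f_{i,s} \in L^\infty(\mu)$ are such that for almost
every $x \in X$ the sequence $(f_{i,s}(T_i^n x))$ is a $k$-step nilsequence, say $(\mathcal{N}_{i,x}(n))$. As we explained
before, when computing the limit in $L^2(\mu)$ of the averages in (36), the contribution of the
terms $f_{i,u}$ is negligible. Furthermore, the same holds for the contribution of the terms $f_{i,e}$.
This follows since for every $N \in \mathbb{N}$ the $L^1(\mu)$ norm of the averages in (36) is bounded by a
constant multiple of

$$\frac{1}{N}\sum_{n=1}^{N}\big\|T_i^{[a_i(n)]}|f_{i,e}|\big\|_{L^1(\mu)} = \|f_{i,e}\|_{L^1(\mu)} \leq \varepsilon.$$

Therefore, it remains to examine the contribution of the terms $f_{i,s}$. In this case, the average
in (36) takes the form

$$\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{\ell}\mathcal{N}_{i,x}([a_i(n)]).$$

Using identity (34) we get that the limit of this average is

$$\prod_{i=1}^{\ell}\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mathcal{N}_{i,x}(n)$$

which in turn is equal to

$$\prod_{i=1}^{\ell}\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} f_{i,s}(T_i^n x).$$

For reasons explained before this is equal, up to a constant multiple of $\varepsilon$, to

$$\prod_{i=1}^{\ell}\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} f_i(T_i^n x) = \prod_{i=1}^{\ell}\tilde f_i.$$

Letting $\varepsilon \to 0$ completes the proof. □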

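The proof of the next result is completely analogous to the proof of Proposition 7.6; one uses
Proposition 7.3 in place of Proposition 7.2 and Proposition 7.5 in place of Proposition 7.4.

Proposition 7.7. Let $(X, \mathcal{X}, \mu, T_1, \ldots, T_\ell)$ be a system and $f_1, \ldots, f_\ell \in L^\infty(\mu)$ be functions.
Let $\mathcal{H}$ be a Hardy field and $a_1, \ldots, a_\ell \in \mathcal{G} \cap \mathcal{H}$ be functions with different growth rates and
positive degree. Then

$$\lim_{R\to\infty}\ \limsup_{N\to\infty}\ \frac{1}{N}\sum_{n=1}^{N}\Big\|\frac{1}{R}\sum_{r=1}^{R}\prod_{i=1}^{\ell} T_i^{[a_i(Rn+r)]}f_i - \prod_{i=1}^{\ell}\tilde f_i\Big\|_{L^2(\mu)} = 0,$$

where $\tilde f_i = \mathbb{E}(f_i \mid \mathcal{I}_{T_i})$.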
7.4. Proof of Theorem 2.3 in the general case.

Proof of Main Theorem in the general case. Without loss of generality we can assume that
$a_\ell \prec a_{\ell-1} \prec \cdots \prec a_1$. If all functions $a_1, \ldots, a_\ell$ have degree 0, then the result follows from
Theorem 2.7 in [17]. If all functions $a_1, \ldots, a_\ell$ have positive degree, then the result was proved
in the previous subsection. Hence, we can assume that there exists $m \in \{1, \ldots, \ell - 1\}$ such
that $\deg(a_i) = 0$ for $i = m + 1, \ldots, \ell$ and $\deg(a_i) \geq 1$ for $i = 1, \ldots, m$.
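It suffices to show that if $\tilde f_i = 0$ for some $i \in \{1, \ldots, m\}$, then

$$\limsup_{N\to\infty}\Big\|\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{\ell} T_i^{[a_i(n)]}f_i\Big\|_{L^2(\mu)} = 0, \tag{37}$$

where the convergence takes place in $L^2(\mu)$. For every $R \in \mathbb{N}$ the limit in (37) is equal to

$$\limsup_{N\to\infty}\Big\|\frac{1}{N}\sum_{n=1}^{N}\frac{1}{R}\sum_{r=1}^{R}\prod_{i=1}^{\ell} T_i^{[a_i(nR+r)]}f_i\Big\|_{L^2(\mu)}. \tag{38}$$

Since the functions $a_{m+1}, \ldots, a_\ell \in \mathcal{H}$ have degree 0, it is easy to see the following (one uses
that their derivative converges to 0 and the mean value theorem): for every $R \in \mathbb{N}$, for a set of
$n \in \mathbb{N}$ of density 1, we have $[a_i(nR + r)] = [a_i(nR)]$ for $r = 1, \ldots, R$ and $i = m + 1, \ldots, \ell$. We
deduce that the limit in (38) is equal to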

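$$\limsup_{N\to\infty}\Big\|\frac{1}{N}\sum_{n=1}^{N}\prod_{i=m+1}^{\ell} T_i^{[a_i(nR)]}f_i\cdot\frac{1}{R}\sum_{r=1}^{R}\prod_{i=1}^{m} T_i^{[a_i(nR+r)]}f_i\Big\|_{L^2(\mu)}.$$

This is bounded by a constant times

$$\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\Big\|\frac{1}{R}\sum_{r=1}^{R}\prod_{i=1}^{m} T_i^{[a_i(nR+r)]}f_i\Big\|_{L^2(\mu)}.$$

Using Proposition 7.7 we see that the limit of this expression as $R \to \infty$ is equal to 0. This
completes the proof. □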
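
The density one claim used in this proof can be checked on a concrete example. The Python sketch below takes the illustrative choice $a(t) = \sqrt{t}$ (our assumption; its derivative tends to 0, as required of the functions of degree 0 in the argument) and computes the proportion of $n \leq N$ for which $[a(nR + r)] = [a(nR)]$ holds for all $r = 1, \ldots, R$. The function name `stable_fraction` is ours.

```python
from math import isqrt


def stable_fraction(R, N):
    """Fraction of n in 1..N with [a(nR + r)] = [a(nR)] for all r = 1..R,
    for the illustrative choice a(t) = sqrt(t)."""
    # Since a is increasing, it suffices to compare the endpoints r = 0 and r = R;
    # isqrt returns the exact integer part of the square root.
    good = sum(1 for n in range(1, N + 1) if isqrt(n * R + R) == isqrt(n * R))
    return good / N


for N in (10**3, 10**4, 10**5):
    print(N, stable_fraction(5, N))
```

The exceptional $n$ are those for which a perfect square lies in $(nR, nR + R]$; there are at most about $\sqrt{NR}$ of them up to $N$, so the printed proportions approach 1.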